xrootdfs crash #11

Closed
dougbenjamin opened this issue May 1, 2013 · 6 comments

Comments

@dougbenjamin

Hi,

We are seeing this error in the system logs

May 1 14:54:48 atl008 kernel: xrootdfs[3249] trap divide error rip:336920819a rsp:2af4fdc67710 error:0

for a statically mounted xrootdfs mount point:
xrootdfs /atlfs03/atlas fuse rdr=root://atlfs03.phy.duke.edu:1094//atlas,uid=54657 0 0

[root@atl008 ~]# yum list installed xrootd-fuse
Loaded plugins: downloadonly, kernel-module
Excluding Packages from Extra Packages for Enterprise Linux 5 - x86_64
Finished
Excluding Packages from VDT RPM repository - development versions for Redhat Enterprise Linux 5 and compatible
Finished
Installed Packages
xrootd-fuse.x86_64 1:3.3.1-1.slc5.xu installed

Here are the OS details:

[benjamin@atl008 ~]$ uname -a
Linux atl008.phy.duke.edu 2.6.18-348.4.1.el5 #1 SMP Tue Apr 16 15:42:58 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
[benjamin@atl008 ~]$ cat /etc/redhat-release
Scientific Linux release 5.9 (Boron)

Any ideas how I can get around this issue?

Thanks,

Doug

@xrootd-dev

Hi Doug,

I don't have an idea at this point, but I have two questions: Is there a core file left behind? How often does this happen?

regards,
Wei Yang | yangw@slac.stanford.edu | 650-926-3338(O)

@dougbenjamin
Author

Hi,

More often than I would like. It appears to happen regularly. I'm not
sure how I can capture the core dump when
the mount is started from /etc/fstab (or autofs, for that matter). Any
clues on how to get the core would be helpful.

Thanks,

Doug

@xrootd-dev

I suppose you use a script to start xrootdfs in autofs? (You can use the script in /etc/fstab as well.) In the script, can you add

cd /tmp
ulimit -c unlimited

regards,
Wei Yang | yangw@slac.stanford.edu | 650-926-3338(O)
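
For readers following along, the two lines above could sit in a small wrapper like the sketch below. This is only an illustration: the function name `start_with_core_dumps` and its directory argument are made up here, not part of xrootdfs, and it assumes a POSIX shell.

```shell
#!/bin/sh
# Sketch of a wrapper that enables core dumps before starting a command.
# The function name and its arguments are illustrative, not part of xrootdfs.

# start_with_core_dumps DIR COMMAND [ARGS...]
# Changes into DIR, lifts the core size limit, then runs COMMAND in its place.
start_with_core_dumps() {
    core_dir=$1
    shift
    (
        # Fail early if the directory for core files is not usable.
        cd "$core_dir" || exit 1
        # Raising the soft limit fails if the hard limit is lower; warn and continue.
        ulimit -c unlimited 2>/dev/null ||
            echo "warning: could not raise core size limit" >&2
        exec "$@"
    )
}
```

Saved as a script, the file would end with a call such as `start_with_core_dumps /tmp xrootdfs "$@"`, so that whatever arguments the mount helper passes go through unchanged. Where the core file lands and what it is named also depends on the kernel's `core_pattern` setting, so it may not end up in /tmp on every system.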

@dougbenjamin
Author

How does one use a script in /etc/fstab? I ask because we are having trouble getting the automounter to remount xrootdfs mounts after they have been unmounted when running under SL6 (???!!!???)

@wyang007
Member

wyang007 commented Sep 3, 2013

xrootdfs /xrootd/atlas fuse rdr=root://atl-xrdr:11094//atlas/xrootd,uid=atldq2 1 2

(the first xrootdfs can also be a script)

Wei Yang | yangw@slac.stanford.edu | 650-926-3338(O)
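
As a rough illustration of the "can also be a script" remark, an fstab entry might look like the line below. The path /usr/local/bin/xrootdfs-wrapper.sh is hypothetical, the rest of the line is copied from the example in this comment, and it assumes the FUSE mount helper accepts a full path to an executable script in the first field.

```text
/usr/local/bin/xrootdfs-wrapper.sh /xrootd/atlas fuse rdr=root://atl-xrdr:11094//atlas/xrootd,uid=atldq2 1 2
```

The script would need to be executable and to pass its arguments on to xrootdfs unchanged.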

@abh3
Member

abh3 commented Dec 3, 2014

I don't think this has been reproduced as of late. Has it? I'll reopen this if this is still a problem.

@abh3 abh3 closed this as completed Dec 3, 2014
@ljanyst ljanyst added the fixed label Dec 5, 2014
gbitzes added a commit to gbitzes/xrootd that referenced this issue Nov 15, 2017
cP->dlType is being read outside of the lock. Diagnosed through the
following report from ThreadSanitizer:

WARNING: ThreadSanitizer: data race (pid=13166)
  Write of size 1 at 0x7b28000100d0 by thread T29 (mutexes: write M0, write M0, write M1382, write M0, write M1385):
    #0 XrdSys::IOEvents::Poller::TmoAdd(XrdSys::IOEvents::Channel*, int) /home/gbitzes/xrootd/src/XrdSys/XrdSysIOEvents.cc:1088 (libXrdUtils.so.2+0x0000000338f1)
    #1 XrdSys::IOEvents::Channel::Enable(int, int, char const**) /home/gbitzes/xrootd/src/XrdSys/XrdSysIOEvents.cc:415 (libXrdUtils.so.2+0x0000000355b6)
    #2 XrdCl::PollerBuiltIn::EnableWriteNotification(XrdCl::Socket*, bool, unsigned short) /home/gbitzes/xrootd/src/XrdCl/XrdClPollerBuiltIn.cc:481 (libXrdCl.so.2+0x000000063c50)
    #3 XrdCl::AsyncSocketHandler::EnableUplink() /home/gbitzes/xrootd/src/./XrdCl/XrdClAsyncSocketHandler.hh:96 (libXrdCl.so.2+0x00000006d06f)
    #4 XrdCl::Stream::EnableLink(XrdCl::PathID&) /home/gbitzes/xrootd/src/XrdCl/XrdClStream.cc:226 (libXrdCl.so.2+0x00000006d06f)
    #5 XrdCl::Stream::Send(XrdCl::Message*, XrdCl::OutgoingMsgHandler*, bool, long) /home/gbitzes/xrootd/src/XrdCl/XrdClStream.cc:316 (libXrdCl.so.2+0x00000006e0d7)
    #6 XrdCl::Channel::Send(XrdCl::Message*, XrdCl::OutgoingMsgHandler*, bool, long, XrdCl::VirtualRedirector*) /home/gbitzes/xrootd/src/XrdCl/XrdClChannel.cc:306 (libXrdCl.so.2+0x000000068686)
    #7 XrdCl::PostMaster::Send(XrdCl::URL const&, XrdCl::Message*, XrdCl::OutgoingMsgHandler*, bool, long) /home/gbitzes/xrootd/src/XrdCl/XrdClPostMaster.cc:198 (libXrdCl.so.2+0x000000066ec9)
    #8 XrdCl::MessageUtils::SendMessage(XrdCl::URL const&, XrdCl::Message*, XrdCl::ResponseHandler*, XrdCl::MessageSendParams const&) /home/gbitzes/xrootd/src/XrdCl/XrdClMessageUtils.cc:114 (libXrdCl.so.2+0x0000000b349e)
    #9 XrdCl::FileSystem::Send(XrdCl::Message*, XrdCl::ResponseHandler*, XrdCl::MessageSendParams&) /home/gbitzes/xrootd/src/XrdCl/XrdClFileSystem.cc:1419 (libXrdCl.so.2+0x000000085b3d)
    #10 XrdCl::FileSystem::Query(XrdCl::QueryCode::Code, XrdCl::Buffer const&, XrdCl::ResponseHandler*, unsigned short) /home/gbitzes/xrootd/src/XrdCl/XrdClFileSystem.cc:720 (libXrdCl.so.2+0x00000008fbe3)
    #11 XrdCl::FileSystem::Query(XrdCl::QueryCode::Code, XrdCl::Buffer const&, XrdCl::Buffer*&, unsigned short) /home/gbitzes/xrootd/src/XrdCl/XrdClFileSystem.cc:732 (libXrdCl.so.2+0x00000009006e)
    #12 backend::Query(XrdCl::FileSystem&, XrdCl::QueryCode::Code, XrdCl::Buffer&, XrdCl::Buffer*&) /afs/cern.ch/user/g/gbitzes/dev/eos/fusex/backend/backend.cc:875 (eosxd+0x00000067fb9f)
    #13 backend::putMD(fuse_id const&, eos::fusex::md*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, XrdSysMutex*) /afs/cern.ch/user/g/gbitzes/dev/eos/fusex/backend/backend.cc:454 (eosxd+0x00000068912e)
    #14 metad::mdcflush(ThreadAssistant&) /afs/cern.ch/user/g/gbitzes/dev/eos/fusex/md/md.cc:1887 (eosxd+0x000000611515)
    #15 void std::__invoke_impl<void, void (metad::*)(ThreadAssistant&), metad*, ThreadAssistant&>(std::__invoke_memfun_deref, void (metad::*&&)(ThreadAssistant&), metad*&&, ThreadAssistant&) /usr/include/c++/7/bits/invoke.h:73 (eosxd+0x0000005b1fb6)
    #16 std::__invoke_result<void (metad::*)(ThreadAssistant&), metad*, ThreadAssistant&>::type std::__invoke<void (metad::*)(ThreadAssistant&), metad*, ThreadAssistant&>(void (metad::*&&)(ThreadAssistant&), metad*&&, ThreadAssistant&) /usr/include/c++/7/bits/invoke.h:95 (eosxd+0x0000005b1fb6)
    #17 decltype (__invoke((_S_declval<0ul>)(), (_S_declval<1ul>)(), (_S_declval<2ul>)())) std::thread::_Invoker<std::tuple<void (metad::*)(ThreadAssistant&), metad*, ThreadAssistant&> >::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) /usr/include/c++/7/thread:234 (eosxd+0x0000005b1fb6)
    #18 std::thread::_Invoker<std::tuple<void (metad::*)(ThreadAssistant&), metad*, ThreadAssistant&> >::operator()() /usr/include/c++/7/thread:243 (eosxd+0x0000005b1fb6)
    #19 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (metad::*)(ThreadAssistant&), metad*, ThreadAssistant&> > >::_M_run() /usr/include/c++/7/thread:186 (eosxd+0x0000005b1fb6)
    #20 <null> <null> (libstdc++.so.6+0x0000000bc01e)

  Previous read of size 1 at 0x7b28000100d0 by thread T34:
    #0 XrdSys::IOEvents::Poller::CbkTMO() /home/gbitzes/xrootd/src/XrdSys/XrdSysIOEvents.cc:597 (libXrdUtils.so.2+0x000000034d8a)
    #1 XrdSys::IOEvents::Poller::TmoGet() /home/gbitzes/xrootd/src/XrdSys/XrdSysIOEvents.cc:1150 (libXrdUtils.so.2+0x000000035166)
    #2 XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) /home/gbitzes/xrootd/src/./XrdSys/XrdSysIOEventsPollE.icc:213 (libXrdUtils.so.2+0x000000036787)
    #3 XrdSys::IOEvents::BootStrap::Start(void*) /home/gbitzes/xrootd/src/XrdSys/XrdSysIOEvents.cc:131 (libXrdUtils.so.2+0x000000030c5c)
    #4 XrdSysThread_Xeq /home/gbitzes/xrootd/src/XrdSys/XrdSysPthread.cc:86 (libXrdUtils.so.2+0x00000002d50f)
    #5 <null> <null> (libtsan.so.0+0x0000000257eb)

  Location is heap block of size 152 at 0x7b2800010040 allocated by thread T33:
    #0 operator new(unsigned long) <null> (libtsan.so.0+0x00000006f766)
    #1 XrdCl::PollerBuiltIn::AddSocket(XrdCl::Socket*, XrdCl::SocketHandler*) /home/gbitzes/xrootd/src/XrdCl/XrdClPollerBuiltIn.cc:295 (libXrdCl.so.2+0x000000064801)
    #2 XrdCl::AsyncSocketHandler::Connect(long) /home/gbitzes/xrootd/src/XrdCl/XrdClAsyncSocketHandler.cc:167 (libXrdCl.so.2+0x000000114155)
    #3 XrdCl::Stream::EnableLink(XrdCl::PathID&) /home/gbitzes/xrootd/src/XrdCl/XrdClStream.cc:271 (libXrdCl.so.2+0x00000006db5c)
    #4 XrdCl::Stream::Send(XrdCl::Message*, XrdCl::OutgoingMsgHandler*, bool, long) /home/gbitzes/xrootd/src/XrdCl/XrdClStream.cc:316 (libXrdCl.so.2+0x00000006e0d7)
    #5 XrdCl::Channel::Send(XrdCl::Message*, XrdCl::OutgoingMsgHandler*, bool, long, XrdCl::VirtualRedirector*) /home/gbitzes/xrootd/src/XrdCl/XrdClChannel.cc:306 (libXrdCl.so.2+0x000000068686)
    #6 XrdCl::PostMaster::Send(XrdCl::URL const&, XrdCl::Message*, XrdCl::OutgoingMsgHandler*, bool, long) /home/gbitzes/xrootd/src/XrdCl/XrdClPostMaster.cc:198 (libXrdCl.so.2+0x000000066ec9)
    #7 XrdCl::MessageUtils::SendMessage(XrdCl::URL const&, XrdCl::Message*, XrdCl::ResponseHandler*, XrdCl::MessageSendParams const&) /home/gbitzes/xrootd/src/XrdCl/XrdClMessageUtils.cc:114 (libXrdCl.so.2+0x0000000b349e)
    #8 XrdCl::FileStateHandler::Open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned short, unsigned short, XrdCl::ResponseHandler*, unsigned short) /home/gbitzes/xrootd/src/XrdCl/XrdClFileStateHandler.cc:525 (libXrdCl.so.2+0x0000000c6d3c)
    #9 XrdCl::File::Open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, XrdCl::OpenFlags::Flags, XrdCl::Access::Mode, XrdCl::ResponseHandler*, unsigned short) /home/gbitzes/xrootd/src/XrdCl/XrdClFile.cc:108 (libXrdCl.so.2+0x0000000b9929)
    #10 XrdCl::File::Open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, XrdCl::OpenFlags::Flags, XrdCl::Access::Mode, unsigned short) /home/gbitzes/xrootd/src/XrdCl/XrdClFile.cc:120 (libXrdCl.so.2+0x0000000b9a8b)
    #11 backend::fetchResponse(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::vector<eos::fusex::container, std::allocator<eos::fusex::container> >&) /afs/cern.ch/user/g/gbitzes/dev/eos/fusex/backend/backend.cc:220 (eosxd+0x0000006860e5)
    #12 backend::getMD(fuse_req*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<eos::fusex::container, std::allocator<eos::fusex::container> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) /afs/cern.ch/user/g/gbitzes/dev/eos/fusex/backend/backend.cc:152 (eosxd+0x000000687dd2)
    #13 metad::get(fuse_req*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, std::shared_ptr<metad::mdx>, char const*, bool) /afs/cern.ch/user/g/gbitzes/dev/eos/fusex/md/md.cc:576 (eosxd+0x00000061b6a5)
    #14 metad::lookup(fuse_req*, unsigned long, char const*) /afs/cern.ch/user/g/gbitzes/dev/eos/fusex/md/md.cc:176 (eosxd+0x00000061cd61)
    #15 EosFuse::lookup(fuse_req*, unsigned long, char const*) /afs/cern.ch/user/g/gbitzes/dev/eos/fusex/eosfuse.cc:1619 (eosxd+0x00000058c343)
    #16 <null> <null> (libfuse.so.2+0x000000016042)

[ ... ]

SUMMARY: ThreadSanitizer: data race /home/gbitzes/xrootd/src/XrdSys/XrdSysIOEvents.cc:1088 in XrdSys::IOEvents::Poller::TmoAdd(XrdSys::IOEvents::Channel*, int)
amadio pushed a commit to amadio/xrootd that referenced this issue May 12, 2023
Set the plugin version to the required number
amadio pushed a commit to amadio/xrootd that referenced this issue May 17, 2023
Change looks good to me.