python xrootd 5.4.0 el7:: double free or corruption #1576

Closed
adriansev opened this issue Dec 15, 2021 · 9 comments

@adriansev
Contributor

adriansev commented Dec 15, 2021

Hi! While trying to download with 32 parallel copies (a single-threaded copy worked) on CentOS 7 with devtoolset-7 enabled, I got this:

*** Error in `python3': double free or corruption (fasttop): 0x00007fce780022b0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81329)[0x7fcf64caa329]
/lib64/libstdc++.so.6(_ZNSs6assignERKSs+0x9e)[0x7fcf5603a07e]
/home/adrian/.local/lib/python3.6/site-packages/pyxrootd/lib64/libXrdCl.so.3(+0x168ff5)[0x7fcf5670dff5]
/home/adrian/.local/lib/python3.6/site-packages/pyxrootd/lib64/libXrdCl.so.3(_ZN5XrdCl14ClassicCopyJob3RunEPNS_19CopyProgressHandlerE+0x245c)[0x7fcf5671d8dc]
/home/adrian/.local/lib/python3.6/site-packages/pyxrootd/lib64/libXrdCl.so.3(+0x15f220)[0x7fcf56704220]
/home/adrian/.local/lib/python3.6/site-packages/pyxrootd/lib64/libXrdCl.so.3(_ZN5XrdCl10JobManager7RunJobsEv+0x9d)[0x7fcf56737e6d]
/home/adrian/.local/lib/python3.6/site-packages/pyxrootd/lib64/libXrdCl.so.3(+0x192ee9)[0x7fcf56737ee9]
/lib64/libpthread.so.0(+0x7ea5)[0x7fcf65707ea5]
/lib64/libc.so.6(clone+0x6d)[0x7fcf64d27b0d]

Full log: https://asevcenc.web.cern.ch/asevcenc/xrootd_python_2free/error.txt
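The glibc backtrace above packs the module path, mangled symbol, offset, and return address into one line per frame. The following is a minimal sketch for splitting such lines apart; the regular expression and helper names are illustrative assumptions, not part of xrootd, and demangling is only attempted if a `c++filt` binary happens to be installed. Demangled, the named frames read `std::string::assign(std::string const&)`, `XrdCl::ClassicCopyJob::Run(XrdCl::CopyProgressHandler*)`, and `XrdCl::JobManager::RunJobs()`.

```python
import re
import shutil
import subprocess

# Matches glibc backtrace frames such as
#   /lib64/libstdc++.so.6(_ZNSs6assignERKSs+0x9e)[0x7fcf5603a07e]
# The symbol part may be empty, as in /lib64/libc.so.6(+0x81329)[...].
FRAME_RE = re.compile(
    r"^(?P<module>[^()\s]+)\((?P<symbol>[^+()]*)\+(?P<offset>0x[0-9a-f]+)\)"
    r"\[(?P<address>0x[0-9a-f]+)\]$"
)


def parse_frame(line):
    """Return (module, symbol, offset, address), or None if the line is not a frame."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    return m.group("module"), m.group("symbol"), m.group("offset"), m.group("address")


def demangle(symbol):
    """Demangle with c++filt when it is installed; otherwise return the symbol unchanged."""
    tool = shutil.which("c++filt")
    if not symbol or tool is None:
        return symbol
    out = subprocess.run([tool, symbol], stdout=subprocess.PIPE, universal_newlines=True)
    return out.stdout.strip() or symbol


# Two frames copied from the backtrace in this report
frames = [
    "/lib64/libstdc++.so.6(_ZNSs6assignERKSs+0x9e)[0x7fcf5603a07e]",
    "/home/adrian/.local/lib/python3.6/site-packages/pyxrootd/lib64/"
    "libXrdCl.so.3(_ZN5XrdCl10JobManager7RunJobsEv+0x9d)[0x7fcf56737e6d]",
]
for line in frames:
    module, symbol, offset, address = parse_frame(line)
    print(module.rsplit("/", 1)[-1], demangle(symbol), offset)
```

Frames with an empty symbol (for example the `libXrdCl.so.3(+0x168ff5)` entries) can only be resolved against a build with debug information.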

simonmichal self-assigned this Dec 15, 2021
@simonmichal
Contributor

@adriansev : thanks for reporting this problem. Is it reproducible, or was it a one-off (please let it be reproducible)?

@simonmichal
Contributor

@adriansev : do you happen to have the client-side logs?

@adriansev
Contributor Author

@simonmichal sorry for the late answer! Most of the time I just get download errors in the logs, with timeouts like:
[Error ][XRootDTransport ] Message 0x5a00eed0, stream [224, 0] is a response that we're no longer interested in (timed out).
From time to time, though, I also get the segmentation fault; see https://asevcenc.web.cern.ch/asevcenc/xrootd_python_2free/xrdlog_segfault.txt
I will try to get a backtrace when such a fault occurs.

@adriansev
Contributor Author

@simonmichal so far I got this:

Thread 351 "python3" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff9bfbf700 (LWP 14989)]
__memcpy_ssse3 () at ../sysdeps/x86_64/multiarch/memcpy-ssse3.S:2819
2819            movdqu  0x10(%rsi), %xmm1
(gdb) bt
#0  __memcpy_ssse3 () at ../sysdeps/x86_64/multiarch/memcpy-ssse3.S:2819
#1  0x00007fffe81fdea0 in std::char_traits<char>::copy (__n=16777215, __s2=0x7ffe67004148 "", __s1=<optimized out>)
at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/char_traits.h:271
#2  std::basic_streambuf<char, std::char_traits<char> >::xsputn (this=0x7fff9bfbe868, __s=0x7ffe67004148 "", __n=140730576143752)
at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/streambuf.tcc:90
#3  0x00007fffe81f4bb5 in std::basic_streambuf<char, std::char_traits<char> >::sputn (__n=140730576143752, __s=0x7ffe66004188 "`A", this=<optimized out>)
at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/streambuf:451
#4  std::__ostream_write<char, std::char_traits<char> > (__n=140730576143752, __s=0x7ffe66004188 "`A", __out=...)
at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream_insert.h:50
#5  std::__ostream_insert<char, std::char_traits<char> > (__out=..., __s=0x7ffe66004188 "`A", __n=140730576143752)
at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream_insert.h:101
#6  0x00007fffe88eed93 in std::operator<< <char, std::char_traits<char>, std::allocator<char> > (__str=..., __os=...)
at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/basic_string.h:6277
#7  XrdCl::PropertyList::Set<XrdCl::XRootDStatus> (this=this@entry=0xdb1550, name="status", item=...) at /home/adrian/work-GRID/xrootd/src/./XrdCl/XrdClPropertyList.hh:219
#8  0x00007fffe88e7291 in (anonymous namespace)::QueuedCopyJob::Run (this=<optimized out>) at /home/adrian/work-GRID/xrootd/src/XrdCl/XrdClCopyProcess.cc:162
#9  0x00007fffe891ae6d in XrdCl::JobManager::RunJobs (this=0x7fffffffb390) at /home/adrian/work-GRID/xrootd/src/XrdCl/XrdClJobManager.cc:153
#10 0x00007fffe891aee9 in RunRunnerThread (arg=<optimized out>) at /home/adrian/work-GRID/xrootd/src/XrdCl/XrdClJobManager.cc:34
#11 0x00007ffff769eea5 in start_thread (arg=0x7fff9bfbf700) at pthread_create.c:307
#12 0x00007ffff6cbeb0d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

@adriansev
Contributor Author

So, I added -g -ggdb to setup.py.in; the output may have more info:

Thread 315 "python3" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffadfe3700 (LWP 22194)]
__memcpy_ssse3 () at ../sysdeps/x86_64/multiarch/memcpy-ssse3.S:2928
2928            movdqu  -0x20(%rsi), %xmm1
(gdb) bt
#0  __memcpy_ssse3 () at ../sysdeps/x86_64/multiarch/memcpy-ssse3.S:2928
#1  0x00007fffe81fdea0 in std::char_traits<char>::copy (__n=33554431, __s2=0x7ffee20021b8 "", __s1=<optimized out>)
at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/char_traits.h:271
#2  std::basic_streambuf<char, std::char_traits<char> >::xsputn (this=0x7fffadfe2868, __s=0x7ffee20021b8 "", __n=140732656523360)
at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/streambuf.tcc:90
#3  0x00007fffe81f4bb5 in std::basic_streambuf<char, std::char_traits<char> >::sputn (__n=140732656523360, __s=0x7ffee00021f8 "xrdcl.in0", this=<optimized out>)
at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/streambuf:451
#4  std::__ostream_write<char, std::char_traits<char> > (__n=140732656523360, __s=0x7ffee00021f8 "xrdcl.in0", __out=...)
at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream_insert.h:50
#5  std::__ostream_insert<char, std::char_traits<char> > (__out=..., __s=0x7ffee00021f8 "xrdcl.in0", __n=140732656523360)
at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream_insert.h:101
#6  0x00007fffe88eed93 in std::operator<< <char, std::char_traits<char>, std::allocator<char> > (__str=..., __os=...)
at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/basic_string.h:6277
#7  XrdCl::PropertyList::Set<XrdCl::XRootDStatus> (this=this@entry=0xd150c0, name="status", item=...) at /home/adrian/work-GRID/xrootd/src/./XrdCl/XrdClPropertyList.hh:219
#8  0x00007fffe88e7291 in (anonymous namespace)::QueuedCopyJob::Run (this=<optimized out>) at /home/adrian/work-GRID/xrootd/src/XrdCl/XrdClCopyProcess.cc:162
#9  0x00007fffe891ae6d in XrdCl::JobManager::RunJobs (this=0x7fffffffb3f0) at /home/adrian/work-GRID/xrootd/src/XrdCl/XrdClJobManager.cc:153
#10 0x00007fffe891aee9 in RunRunnerThread (arg=<optimized out>) at /home/adrian/work-GRID/xrootd/src/XrdCl/XrdClJobManager.cc:34
#11 0x00007ffff769eea5 in start_thread (arg=0x7fffadfe3700) at pthread_create.c:307
#12 0x00007ffff6cbeb0d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
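A note on reading the two gdb backtraces: the `__n` argument in frames #2 to #5 should be a byte count, yet both reported values are around 1.4e14. Printed in hexadecimal they land in the same address range as the `__s`/`__s2` pointers, which suggests (but does not prove) that the `std::string` being streamed inside `XrdCl::PropertyList::Set` was already corrupted or freed when its length was read. The sketch below only does that conversion; the address-range check is a rough heuristic assumed here for illustration, and gdb can misreport arguments in optimized frames.

```python
# __n values reported in frames #2-#5 of the two gdb backtraces
reported_lengths = [140730576143752, 140732656523360]
# __s2 source pointers reported in frame #1 of the same backtraces
source_pointers = [0x7FFE67004148, 0x7FFEE20021B8]


def looks_like_user_address(value):
    """Heuristic (an assumption): x86-64 Linux user mappings often sit in 0x7f0000000000-0x7fffffffffff."""
    return 0x7F0000000000 <= value <= 0x7FFFFFFFFFFF


for n, ptr in zip(reported_lengths, source_pointers):
    # A genuine string length would be tiny compared with these values.
    print(hex(n), hex(ptr), looks_like_user_address(n))
```

If the length really is a stale pointer, that would fit the "double free or corruption" abort seen in the first report, since both point at a string whose storage is shared or released across threads.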

@adriansev
Contributor Author

Also, just for info, I can trigger this 3 times out of 5 (I think it depends on network conditions).
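For anyone trying to reproduce the load described in this thread (32 parallel copies through the Python bindings), a rough sketch follows. It assumes the copies are queued on a single `XRootD.client.CopyProcess`, which matches the `QueuedCopyJob` and `JobManager` frames in the backtraces but is not confirmed by the report. The URLs, the destination directory, and the helper names are hypothetical, and the `CopyProcess` method names should be checked against the installed version of the bindings.

```python
def build_jobs(sources, dest_dir):
    """Pair each source URL with a local target path under dest_dir."""
    jobs = []
    for src in sources:
        name = src.rstrip("/").rsplit("/", 1)[-1]
        jobs.append((src, dest_dir.rstrip("/") + "/" + name))
    return jobs


def run_parallel_copies(jobs, parallel=32):
    """Queue all jobs on one CopyProcess and run them with `parallel` concurrent jobs."""
    from XRootD import client  # imported lazily; requires the xrootd Python bindings

    process = client.CopyProcess()
    process.parallel(parallel)  # number of copy jobs run concurrently
    for src, dst in jobs:
        process.add_job(src, dst, force=True)  # force=True overwrites existing targets
    process.prepare()
    return process.run()


# Hypothetical URLs, for illustration only
sources = ["root://example.org//store/file%03d.root" % i for i in range(64)]
jobs = build_jobs(sources, "/tmp/xrd_repro")
# run_parallel_copies(jobs)  # uncomment on a machine with xrootd and reachable sources
```

Since the failure seems to depend on network conditions, several runs against slow or timing-out sources may be needed before anything shows up.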

@simonmichal
Contributor

@adriansev : sorry for the delay, I still cannot reproduce the problem. Is there any chance you could run with valgrind or, alternatively, build xrootd with the -DENABLE_ASAN=ON cmake flag?
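As a sketch of the valgrind route, the helper below only assembles a command line and environment for running a Python reproducer under memcheck. The script name `repro.py` and the log file name are hypothetical. Setting `PYTHONMALLOC=malloc` (available from Python 3.6) is an assumption about what keeps the report readable: it makes CPython use plain `malloc`/`free`, so memcheck sees individual allocations instead of pymalloc arenas.

```python
import os


def valgrind_command(script, log_file="valgrind.log"):
    """Build the argv list for running a Python script under valgrind memcheck."""
    return [
        "valgrind",
        "--tool=memcheck",
        "--track-origins=yes",  # report where uninitialised values came from
        "--log-file=" + log_file,
        "python3",
        script,
    ]


def valgrind_env():
    """Copy the current environment and make CPython use plain malloc/free."""
    env = dict(os.environ)
    env["PYTHONMALLOC"] = "malloc"
    return env


cmd = valgrind_command("repro.py")  # repro.py is a hypothetical script name
print(" ".join(cmd))
# To launch it: subprocess.run(cmd, env=valgrind_env())
```

Expect the run to be much slower than normal, which may itself change how often the timeouts and the crash occur.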

@adriansev
Contributor Author

@simonmichal I will try it and get back with feedback when done. Thanks a lot!

@adriansev
Contributor Author

So, I could not reproduce this with 5.4.2 under any load that I tried. If I encounter anything similar, I will open a new issue.
