xrd existfile aborts (4.8.1-1.el6) #680

Closed
jmuf opened this issue Apr 4, 2018 · 8 comments

Comments

@jmuf
Contributor

jmuf commented Apr 4, 2018

Don't have the actual "abort" message, but I guess this was something about the array-like access to vb[0]?

Core was generated by `xrd eosuser.cern.ch existfile /eos/user/C/CHANGED/atlas/h019/mc16a/stage/LowHig'.
Program terminated with signal 6, Aborted.
(gdb) bt
#0  0x00007fb2d97ac495 in raise () from /lib64/libc.so.6
#1  0x00007fb2d97adc75 in abort () from /lib64/libc.so.6
#2  0x00000000004095c6 in At (tkzer=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdClient/XrdClientVector.hh:306
#3  operator[] (tkzer=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdClient/XrdClientVector.hh:312
#4  executeExistFile (tkzer=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdClient/XrdCommandLine.cc:976
#5  0x000000000040cb77 in main (argc=<value optimized out>, argv=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdClient/XrdCommandLine.cc:1943
(gdb) f 2
#2  0x00000000004095c6 in At (tkzer=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdClient/XrdClientVector.hh:306
306	                             static_cast<unsigned long>(size))) abort();
(gdb) f 4
#4  executeExistFile (tkzer=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdClient/XrdCommandLine.cc:976
976	        if (vb[0] && (vb.GetSize() >= 1))
(gdb) print vb
$1 = {sizeof_t = 4, rawdata = 0x1e7a8c0 "x\203\260ٲ\177", index = 0x1e7afb0, holecount = 0, size = 0, mincap = 128, capacity = 128, maxsize = 128}
(gdb) print vb[0]
Could not find operator[].
(gdb) f 2
#2  0x00000000004095c6 in At (tkzer=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdClient/XrdClientVector.hh:306
306	                             static_cast<unsigned long>(size))) abort();
(gdb) l
303	    // Bounded array like access
304	    inline T &At(int pos) {
305	           if ((pos < 0) || (static_cast<unsigned long>(pos) >=
306	                             static_cast<unsigned long>(size))) abort();
307	
308		return *( reinterpret_cast<T*>(rawdata + index[pos].offs));
309	    }
310	
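
Reading the two frames together: the condition on line 976 evaluates `vb[0]` before it tests `vb.GetSize() >= 1`, so with `size = 0` the bounded `At(0)` calls `abort()` before the size test is ever reached. Below is a minimal sketch of the reordered check. `BoundedVector` is a hypothetical stand-in for `XrdClientVector`, and `firstAnswerIsTrue` is a hypothetical helper; neither name exists in xrootd, and this is not necessarily how the code was changed upstream.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for XrdClientVector<T>. Only the members relevant to
// the crash are modelled; At() aborts on an out-of-range index, mirroring the
// listing above.
template <typename T>
class BoundedVector {
public:
    void Push_back(const T &value) { data_.push_back(value); }
    int GetSize() const { return static_cast<int>(data_.size()); }

    // Bounded array-like access: aborts instead of reading out of range.
    T &At(int pos) {
        if ((pos < 0) || (static_cast<unsigned long>(pos) >= data_.size()))
            std::abort();
        return data_[pos];
    }
    T &operator[](int pos) { return At(pos); }

private:
    std::vector<T> data_;
};

// Size check first: && short-circuits, so answers[0] is evaluated only when
// the vector holds at least one element.
bool firstAnswerIsTrue(BoundedVector<int> &answers) {
    return (answers.GetSize() >= 1) && answers[0];
}
```

With the operands in this order an empty reply simply yields `false` instead of terminating the process.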

Version:
pkg_name: xrootd-client
pkg_release: 1.el6
pkg_vendor: Scientific Linux CERN, http://cern.ch/linux
pkg_version: 4.8.1

var_log_messages:
:Apr 3 20:16:32 lxplus020 abrt[16040]: Saved core dump of pid 16037 (/usr/bin/xrd) to /var/spool/abrt/ccpp-2018-04-03-20:16:31-16037 (27258880 bytes)
:Apr 3 20:16:32 lxplus020 abrt[16046]: Not saving repeating crash in '/usr/bin/xrd'
:Apr 3 20:16:32 lxplus020 abrt[16050]: Not saving repeating crash in '/usr/bin/xrd'
:Apr 3 20:16:32 lxplus020 abrt[16054]: Not saving repeating crash in '/usr/bin/xrd'

lxplus020 $ repoquery --location xrootd-client
http://linuxsoft.cern.ch/internal/repos/eos6-stable/x86_64/os/Packages/xrootd-client-4.8.1-1.el6.x86_64.rpm

@jmuf
Contributor Author

jmuf commented Apr 4, 2018

Same backtrace (vb again has size = 0), same version, different machine, user and endpoint:

Core was generated by `xrd cms-xrd-global.cern.ch'.
Program terminated with signal 6, Aborted.
#0  0x00007ff5d2c2f495 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.209.el6_9.2.x86_64 libgcc-4.4.7-18.el6_9.2.x86_64 libstdc++-4.4.7-18.el6_9.2.x86_64 ncurses-libs-5.7-4.20090207.el6.x86_64 readline-6.0-4.el6.x86_64
(gdb) bt
#0  0x00007ff5d2c2f495 in raise () from /lib64/libc.so.6
#1  0x00007ff5d2c30c75 in abort () from /lib64/libc.so.6
#2  0x00000000004095c6 in At (tkzer=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdClient/XrdClientVector.hh:306
#3  operator[] (tkzer=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdClient/XrdClientVector.hh:312
#4  executeExistFile (tkzer=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdClient/XrdCommandLine.cc:976
#5  0x000000000040cb77 in main (argc=<value optimized out>, argv=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdClient/XrdCommandLine.cc:1943
(gdb) f 4
#4  executeExistFile (tkzer=<value optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdClient/XrdCommandLine.cc:976
976	        if (vb[0] && (vb.GetSize() >= 1))
(gdb) print vb
$1 = {sizeof_t = 4, rawdata = 0x1f51130 "", index = 0x1f51340, holecount = 0, size = 0, mincap = 128, capacity = 128, maxsize = 128}

~$ rpm -qf /usr/bin/xrd
1:xrootd-client-4.8.1-1.el6.x86_64

@ljanyst
Contributor

ljanyst commented Apr 4, 2018

Jan, this stuff was deprecated way before I left.

@bbockelm
Contributor

bbockelm commented Apr 5, 2018

@jmuf - put another way: how can we help you move to a newer client? We'd love to help out...

@jmuf
Contributor Author

jmuf commented Apr 5, 2018

The question is whether anybody has told the users that this command would be deprecated (or perhaps be replaced by "xrdfs")?

  • "xrd" is part of "xrootd-client" (i.e. not some "-legacy" package name)
  • "xrd" does not emit any "I am deprecated, use 'xrdfs'"-warnings at runtime
  • "man xrd" does clearly mark this as being DEPRECATED
  • https://github.com/xrootd/xrootd/blob/v4.8.1/docs/ReleaseNotes.txt does not seem to mention "xrd"

Even if all of these were different, we'd still have to deal with hardcoded legacy usage (e.g. by experiments) and negotiate a phaseout (the crashes seem to have been from interactive use).
But for now I don't think the average user has any idea that they should not use this command. For "xrdcp" (which I guess was in a similar situation), the new "xrdcopy" took over the "xrdcp" name in 4.0, but nothing similar happened for "xrd"/"xrdfs".

(just to clarify: I've seen these crashes happen to other users on the CERN LXPLUS cluster. I am not using xrd myself)

@ljanyst
Contributor

ljanyst commented Apr 5, 2018

Yes, all this was communicated every time I got an opportunity to speak publicly, which was quite a lot. Not putting a deprecation message in xrd was clearly a mistake. On one hand, I guess, nobody talks about these tools being deprecated anymore, so people forget. On the other hand, these tools should have been removed long ago. Michal told me he did not do it because of packaging/EPEL issues.

@simonmichal
Contributor

My understanding is that xrd has been deprecated since version 4.0.0 and is definitely no longer supported (I think this has also been announced on the xrootd mailing list). Because of the way the old client has been packaged, we can only remove it in a major release.

I'll add a warning, update the man pages and prepare a separate announcement with a reminder that the old client is not to be used.

@bbockelm
Contributor

bbockelm commented Apr 5, 2018

Could we also not build it by default in future releases (i.e., require a flag to be passed for it to actually be built and shipped)? That would provide yet another nudge in the right direction.

@simonmichal
Contributor

The DEPRECATION notes are present in xrd and xrdcp-old man pages:
https://github.com/xrootd/xrootd/blob/master/docs/man/xrd.1#L14-L15
https://github.com/xrootd/xrootd/blob/master/docs/man/xrdcp-old.1#L24-L25

I have added warnings to the binaries in:
e341f47

so I think I can safely close this one.
