Memory leaks from xrootd #1424

Closed
andreypz opened this issue Mar 11, 2021 · 5 comments

@andreypz

Hello,
My program uses a lot of memory, so I decided to check it with valgrind.
It reports many errors from xrootd, like this:

==2385319== Conditional jump or move depends on uninitialised value(s)
==2385319==    at 0x88E6153: BN_ucmp (bn_lib.c:673)
==2385319==    by 0x88E3775: BN_div (bn_div.c:318)
==2385319==    by 0x88EC85B: probable_prime_dh_safe (bn_prime.c:476)
==2385319==    by 0x88EC85B: BN_generate_prime_ex (bn_prime.c:186)
==2385319==    by 0x891297D: dh_builtin_genparams (dh_gen.c:194)
==2385319==    by 0x891297D: DH_generate_parameters_ex (dh_gen.c:88)
==2385319==    by 0x87C1A38: XrdCryptosslCipher::XrdCryptosslCipher(bool, int, char*, int, char const*) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/\
xrootd/4.12.3/lib64/libXrdCryptossl-4.so)
==2385319==    by 0x87CE3D5: XrdCryptosslFactory::Cipher(int, char*, int, char const*) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.12.3/lib\
64/libXrdCryptossl-4.so)
==2385319==    by 0x8772962: XrdSecProtocolgsi::ParseCrypto(XrdOucString) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.12.3/lib64/libXrdSecg\
si-4.so)
==2385319==    by 0x8780DFD: XrdSecProtocolgsi::ClientDoInit(XrdSutBuffer*, XrdSutBuffer**, XrdOucString&) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/extern\
al/xrootd/4.12.3/lib64/libXrdSecgsi-4.so)
==2385319==    by 0x87814A4: XrdSecProtocolgsi::ParseClientInput(XrdSutBuffer*, XrdSutBuffer**, XrdOucString&) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/ex\
ternal/xrootd/4.12.3/lib64/libXrdSecgsi-4.so)
==2385319==    by 0x8781894: XrdSecProtocolgsi::getCredentials(XrdSecBuffer*, XrdOucErrInfo*) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.1\
2.3/lib64/libXrdSecgsi-4.so)
==2385319==    by 0x40F160F: XrdCl::XRootDTransport::GetCredentials(XrdSecBuffer*&, XrdCl::HandShakeData*, XrdCl::XRootDChannelInfo*) (in /cvmfs/cms.cern.\
ch/slc7_amd64_gcc900/external/xrootd/4.12.3/lib64/libXrdCl.so.2.0.0)
==2385319==    by 0x40F248B: XrdCl::XRootDTransport::DoAuthentication(XrdCl::HandShakeData*, XrdCl::XRootDChannelInfo*) (in /cvmfs/cms.cern.ch/slc7_amd64_\
gcc900/external/xrootd/4.12.3/lib64/libXrdCl.so.2.0.0)
==2385319==

Are those intentional? If so, is there a suppression file that I could use?

The instance of xrootd I use is this one (which reports unknown version with xrootd -v):

$ which xrootd
/cvmfs/cms.cern.ch/slc7_amd64_gcc900/cms/cmssw/CMSSW_11_3_0_pre3/external/slc7_amd64_gcc900/bin/xrootd
@olifre
Contributor

olifre commented Mar 11, 2021

In such cases, it's very useful to run valgrind with --track-origins=yes to spot the location where the uninitialized value is created.
This should help to track down the actual cause ;-).

@andreypz
Author

I just re-ran it with the following options:

valgrind --suppressions=$ROOTSYS/etc/valgrind-root.supp --track-origins=yes --trace-children=yes --log-file=valgr.log --tool=memcheck --leak-check=full ./myscript.py

These are xrd-related entries that I get now:

==2396841== 150 bytes in 1 blocks are definitely lost in loss record 27,062 of 41,982
==2396841==    at 0x402D753: malloc (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/valgrind/3.15.0-bcolbf2/lib/valgrind/vgpreload_memcheck-amd64-linux.\
so)
==2396841==    by 0x55E1010: __libc_alloc_buffer_allocate (in /usr/lib64/libc-2.17.so)
==2396841==    by 0x56670CE: __resolv_conf_allocate (in /usr/lib64/libc-2.17.so)
==2396841==    by 0x56650EC: __resolv_conf_load (in /usr/lib64/libc-2.17.so)
==2396841==    by 0x5666987: __resolv_conf_get_current (in /usr/lib64/libc-2.17.so)
==2396841==    by 0x5665391: __res_vinit (in /usr/lib64/libc-2.17.so)
==2396841==    by 0x56664AA: maybe_init (in /usr/lib64/libc-2.17.so)
==2396841==    by 0x566661D: __resolv_context_get (in /usr/lib64/libc-2.17.so)
==2396841==    by 0x566F81B: gethostbyname2_r@@GLIBC_2.2.5 (in /usr/lib64/libc-2.17.so)
==2396841==    by 0x563CCB5: gaih_inet.constprop.8 (in /usr/lib64/libc-2.17.so)
==2396841==    by 0x563D5F3: getaddrinfo (in /usr/lib64/libc-2.17.so)
==2396841==    by 0x2C98380E: XrdNetUtils::GetAddrs(char const*, XrdNetAddr**, int&, XrdNetUtils::AddrOpts, int) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/\
external/xrootd/4.12.3/lib64/libXrdUtils.so.2.0.0)

and

==2396841== 712 (80 direct, 632 indirect) bytes in 1 blocks are definitely lost in loss record 34,343 of 41,982
==2396841==    at 0x402DDB2: operator new(unsigned long) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/valgrind/3.15.0-bcolbf2/lib/valgrind/vgpreload_\
memcheck-amd64-linux.so)
==2396841==    by 0x2CC923BC: XrdCryptosslFactory::Cipher(int, char*, int, char const*) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.12.3/li\
b64/libXrdCryptossl-4.so)
==2396841==    by 0x2CC36962: XrdSecProtocolgsi::ParseCrypto(XrdOucString) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.12.3/lib64/libXrdSec\
gsi-4.so)
==2396841==    by 0x2CC44DFD: XrdSecProtocolgsi::ClientDoInit(XrdSutBuffer*, XrdSutBuffer**, XrdOucString&) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/exter\
nal/xrootd/4.12.3/lib64/libXrdSecgsi-4.so)
==2396841==    by 0x2CC454A4: XrdSecProtocolgsi::ParseClientInput(XrdSutBuffer*, XrdSutBuffer**, XrdOucString&) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/e\
xternal/xrootd/4.12.3/lib64/libXrdSecgsi-4.so)
==2396841==    by 0x2CC45894: XrdSecProtocolgsi::getCredentials(XrdSecBuffer*, XrdOucErrInfo*) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.\
12.3/lib64/libXrdSecgsi-4.so)
==2396841==    by 0x2CAD260F: XrdCl::XRootDTransport::GetCredentials(XrdSecBuffer*&, XrdCl::HandShakeData*, XrdCl::XRootDChannelInfo*) (in /cvmfs/cms.cern\
.ch/slc7_amd64_gcc900/external/xrootd/4.12.3/lib64/libXrdCl.so.2.0.0)
==2396841==    by 0x2CAD348B: XrdCl::XRootDTransport::DoAuthentication(XrdCl::HandShakeData*, XrdCl::XRootDChannelInfo*) (in /cvmfs/cms.cern.ch/slc7_amd64\
_gcc900/external/xrootd/4.12.3/lib64/libXrdCl.so.2.0.0)
==2396841==    by 0x2CAD3FF5: XrdCl::XRootDTransport::HandShakeMain(XrdCl::HandShakeData*, XrdCl::AnyObject&) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/ext\
ernal/xrootd/4.12.3/lib64/libXrdCl.so.2.0.0)
==2396841==    by 0x2CAD40CA: XrdCl::XRootDTransport::HandShake(XrdCl::HandShakeData*, XrdCl::AnyObject&) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/externa\
l/xrootd/4.12.3/lib64/libXrdCl.so.2.0.0)
==2396841==    by 0x2CB333E5: XrdCl::AsyncSocketHandler::OnReadWhileHandshaking() (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.12.3/lib64/li\
bXrdCl.so.2.0.0)
==2396841==    by 0x2CB3376C: XrdCl::AsyncSocketHandler::Event(unsigned char, XrdCl::Socket*) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.1\
2.3/lib64/libXrdCl.so.2.0.0)
==2396841==

and

==2396841== 832 bytes in 1 blocks are possibly lost in loss record 35,122 of 41,982
==2396841==    at 0x402F9A2: calloc (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/valgrind/3.15.0-bcolbf2/lib/valgrind/vgpreload_memcheck-amd64-linux.\
so)
==2396841==    by 0x4012784: _dl_allocate_tls (in /usr/lib64/ld-2.17.so)
==2396841==    by 0x4C3987B: pthread_create@@GLIBC_2.2.5 (in /usr/lib64/libpthread-2.17.so)
==2396841==    by 0x2CAD9358: XrdCl::TaskManager::Start() (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.12.3/lib64/libXrdCl.so.2.0.0)
==2396841==    by 0x2CAC7E58: XrdCl::PostMaster::Start() (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.12.3/lib64/libXrdCl.so.2.0.0)
==2396841==    by 0x2CAB5FB7: XrdCl::DefaultEnv::GetPostMaster() (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.12.3/lib64/libXrdCl.so.2.0.0)
==2396841==    by 0x2CB5ABE6: XrdCl::LocalFileHandler::LocalFileHandler() (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.12.3/lib64/libXrdCl.s\
o.2.0.0)
==2396841==    by 0x2CB0343B: XrdCl::FileStateHandler::FileStateHandler() (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.12.3/lib64/libXrdCl.s\
o.2.0.0)
==2396841==    by 0x2CAFD895: XrdCl::File::File(bool) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/external/xrootd/4.12.3/lib64/libXrdCl.so.2.0.0)
==2396841==    by 0x2C91BAA4: TNetXNGFile::TNetXNGFile(char const*, char const*, char const*, char const*, int, int, bool) (in /cvmfs/cms.cern.ch/slc7_amd\
64_gcc900/lcg/root/6.22.06-ljfedo2/lib/libNetxNG.so)
==2396841==    by 0x2C91C60B: TNetXNGFile::TNetXNGFile(char const*, char const*, char const*, int, int, bool) (in /cvmfs/cms.cern.ch/slc7_amd64_gcc900/lcg\
/root/6.22.06-ljfedo2/lib/libNetxNG.so)
==2396841==    by 0x2CC0F612: ???
==2396841==

I also uploaded the full log here: https://cernbox.cern.ch/index.php/s/ztbGo5drEuuOuNA

@olifre
Contributor

olifre commented Mar 11, 2021

Sadly, the initial conditional jump is missing from the new full log.
Also, that seems to be an older XRootD 4 version, so some of these may already have been addressed.

The first one likely comes from here, or an earlier variant of that code:

int rc = getaddrinfo(aInfo.ipAddr, 0, &aInfo.hints, &rP);
if (rc || !rP)
{
    if (rP) freeaddrinfo(rP);
    return (rc ? gai_strerror(rc) : "host not found");
}

but I don't see a logic error there...
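
For illustration only: the snippet quoted above frees the result list on its error path, and a caller must also call freeaddrinfo() once it has finished with a successful result. Below is a minimal sketch of one way to make that pairing automatic. The names AddrInfoDeleter, AddrInfoPtr, resolve and count_addrs are hypothetical helpers invented for this example; they are not part of XRootD and do not reflect how XrdNetUtils is implemented.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <cassert>
#include <cstring>
#include <memory>
#include <string>
#include <utility>

// Hypothetical helper (not XRootD code): releases a getaddrinfo() result list.
struct AddrInfoDeleter {
    void operator()(addrinfo* p) const noexcept {
        if (p) freeaddrinfo(p);
    }
};
using AddrInfoPtr = std::unique_ptr<addrinfo, AddrInfoDeleter>;

// Resolves `host` and stores the result list in `out`.
// Returns an empty string on success, otherwise an error message.
std::string resolve(const char* host, int flags, AddrInfoPtr& out) {
    assert(host != nullptr);

    addrinfo hints;
    std::memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;

    addrinfo* raw = nullptr;
    int rc = getaddrinfo(host, nullptr, &hints, &raw);
    AddrInfoPtr guard(raw);  // the list is freed on every return path below

    if (rc != 0) return gai_strerror(rc);
    if (!raw) return "host not found";

    out = std::move(guard);  // hand ownership to the caller
    return std::string();
}

// Counts the entries in a result list.
int count_addrs(const AddrInfoPtr& list) {
    int n = 0;
    for (const addrinfo* p = list.get(); p != nullptr; p = p->ai_next) ++n;
    return n;
}
```

With this shape the list is released when the AddrInfoPtr goes out of scope. Note that the first loss record above originates inside glibc's resolver configuration (__resolv_conf_allocate), so a wrapper like this would not necessarily change that particular report.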

I think the last "possibly lost" is not a real issue; it happens inside ld during library loading.
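
As an aside on that third report: its trace runs through pthread_create, called from XrdCl::TaskManager::Start(). Memcheck commonly lists the thread-local storage block of a thread that is still running when the process exits as "possibly lost". A minimal sketch, assuming a hypothetical Worker class (not XRootD code, and not a claim about how XrdCl manages its threads), of a background thread that is joined before exit so that block can be released:

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for a background task thread; not XRootD code.
class Worker {
public:
    // Starts the background loop; does nothing if it is already running.
    void start() {
        if (thread_.joinable()) return;
        running_ = true;
        thread_ = std::thread([this] {
            while (running_) {
                ++ticks_;
                std::this_thread::yield();
            }
        });
    }

    // Asks the loop to finish and waits for the thread to exit.
    void stop() {
        running_ = false;
        if (thread_.joinable()) thread_.join();
    }

    // Joining in the destructor keeps the thread from outliving its owner.
    ~Worker() { stop(); }

    long ticks() const { return ticks_; }
    bool joined() const { return !thread_.joinable(); }

private:
    std::atomic<bool> running_{false};
    std::atomic<long> ticks_{0};
    std::thread thread_;
};
```

Whether a given report of this kind matters depends on whether the thread is meant to live for the whole process lifetime; a single block allocated once per process is usually harmless.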

But I'll leave a better diagnosis to the XRootD devs :-).

@abh3
Member

abh3 commented Mar 11, 2021 via email

@abh3
Member

abh3 commented Jul 29, 2021

I think it is safe to close this issue.

@abh3 abh3 closed this as completed Jul 29, 2021