segfault in XrdXrootdMonFile::DoXFR #618
Could you also post the logfile with about 10 minutes of context right before the crash? Hopefully, it won’t be huge. If it is, just send it to me directly – Andy
From: jthiltges
Sent: Thursday, November 09, 2017 2:17 PM
To: xrootd/xrootd
Cc: Subscribed
Subject: [xrootd/xrootd] segfault in XrdXrootdMonFile::DoXFR (#618)
Dear XRootD developers,
In testing the OSG build of xrootd v4.7.1 on our StashCache server at Nebraska, we encountered a segfault in XrdXrootdMonFile::DoXFR(). @bbockelm suggested sending the backtrace upstream:
Core was generated by `/usr/bin/xrootd -l /var/log/xrootd/xrootd.log -c /etc/xrootd/xrootd-stashcache-'.
Program terminated with signal 11, Segmentation fault.
#0 XrdXrootdMonFile::DoXFR () at /usr/src/debug/xrootd-4.7.1/src/XrdXrootd/XrdXrootdMonFile.cc:271
271 {if (fsP->xfrXeq) DoXFR(fsP);
Missing separate debuginfos, ...
(gdb) bt
#0 XrdXrootdMonFile::DoXFR () at /usr/src/debug/xrootd-4.7.1/src/XrdXrootd/XrdXrootdMonFile.cc:271
#1 0x00007f09b61d03a5 in XrdXrootdMonFile::DoIt (this=0x1983fa0) at /usr/src/debug/xrootd-4.7.1/src/XrdXrootd/XrdXrootdMonFile.cc:230
#2 0x00007f09b5f5ecff in XrdScheduler::Run (this=0x610e98 <XrdMain::Config+440>) at /usr/src/debug/xrootd-4.7.1/src/Xrd/XrdScheduler.cc:357
#3 0x00007f09b5f5ee49 in XrdStartWorking (carg=<optimized out>) at /usr/src/debug/xrootd-4.7.1/src/Xrd/XrdScheduler.cc:87
#4 0x00007f09b5f1b4d7 in XrdSysThread_Xeq (myargs=0x7f0834513270) at /usr/src/debug/xrootd-4.7.1/src/XrdSys/XrdSysPthread.cc:86
#5 0x00007f09b5ad7e25 in start_thread (arg=0x7f0412ded700) at pthread_create.c:308
#6 0x00007f09b4ddd34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
And here's a bit more info on the loop it's running
(gdb) # fsP doesn't point to allocated memory
(gdb) print fsP
$1 = (XrdXrootdFileStats *) 0x7f036c700d98
(gdb) print *fsP
Cannot access memory at address 0x7f036c700d98
(gdb) # Loop variables
(gdb) print i
$2 = 124
(gdb) print n
$3 = 232
(gdb) # Bad fMap entry and adjacent good entries
(gdb) print XrdXrootdMonFile::fmMap[i].fMap[n-2]
$4 = {{cVal = 139672740028968, cPtr = 0x7f08180de228, vPtr = 0x7f08180de228}}
(gdb) print XrdXrootdMonFile::fmMap[i].fMap[n-1]
$5 = {{cVal = 139652680912280, cPtr = 0x7f036c700d98, vPtr = 0x7f036c700d98}}
(gdb) print XrdXrootdMonFile::fmMap[i].fMap[n]
$6 = {{cVal = 139675760573464, cPtr = 0x7f08cc17bc18, vPtr = 0x7f08cc17bc18}}
(gdb) # Bad fMap.vPtr entry and adjacent good entries
(gdb) print *(XrdXrootdMonFile::fmMap[i].fMap[n-2].vPtr)
$7 = {FileID = 1276444928, MonEnt = -1818, monLvl = 2 '\002', xfrXeq = 0 '\000', fSize = 180785925, xfr = {read = 12931845, readv = 0, write = 0}, ops = {read = 13, readv = 0, write = 0, rsMin = 32767, rsMax = 0, rsegs = 0, rdMin = 348933, rdMax = 1048576,
rvMin = 2147483647, rvMax = 0, wrMin = 2147483647, wrMax = 0}, ssq = {read = 0, readv = 0, rsegs = 0, write = 0}}
(gdb) print *(XrdXrootdMonFile::fmMap[i].fMap[n-1].vPtr)
Cannot access memory at address 0x7f036c700d98
(gdb) print *(XrdXrootdMonFile::fmMap[i].fMap[n].vPtr)
$8 = {FileID = 1309999360, MonEnt = -1816, monLvl = 2 '\002', xfrXeq = 0 '\000', fSize = 178914241, xfr = {read = 33570816, readv = 0, write = 0}, ops = {read = 33, readv = 0, write = 0, rsMin = 32767, rsMax = 0, rsegs = 0, rdMin = 16384, rdMax = 1048576,
rvMin = 2147483647, rvMax = 0, wrMin = 2147483647, wrMax = 0}, ssq = {read = 0, readv = 0, rsegs = 0, write = 0}}
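As a quick sanity check on the dump above: the three members printed for each fMap slot (cVal, cPtr, vPtr) appear to be views of one 64-bit value. That reading is inferred from the gdb output, not confirmed against the XRootD source. A minimal Python sketch, with the values copied from $4, $5 and $6 and a hypothetical helper name, checks that each decimal cVal is the same number as its hex pointer:

```python
# Values copied from gdb outputs $4, $5 and $6 above: (cVal, cPtr/vPtr as hex).
SLOTS = {
    "n-2": (139672740028968, "0x7f08180de228"),
    "n-1": (139652680912280, "0x7f036c700d98"),  # the entry gdb cannot dereference
    "n": (139675760573464, "0x7f08cc17bc18"),
}


def same_bits(c_val, ptr_hex):
    """Hypothetical helper: True if the decimal value equals the hex pointer."""
    return c_val == int(ptr_hex, 16)


for name, (c_val, ptr_hex) in SLOTS.items():
    print(name, same_bits(c_val, ptr_hex))
```

If all three compare equal, the bad slot was not partially overwritten: it holds a complete, well-formed pointer value that simply no longer refers to mapped memory.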
Thanks for the logfile. I don’t see anything here that would cause that area of memory to be unmapped. I say that because the pointer value, even though unusable, does not look unusual at all. Is this the first time this has happened?
Andy
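One way to make "does not look unusual" concrete is to compare the bad pointer with its two readable neighbours. The sketch below is only an illustration: the 47-bit user-space limit (typical for x86-64 Linux, which the backtrace suggests) and the 8-byte alignment are assumptions, and `looks_plausible` is a hypothetical helper, not anything from XRootD or gdb.

```python
# Pointer values copied from the gdb session in this thread.
POINTERS = {
    "fMap[n-2]": 0x7F08180DE228,  # readable
    "fMap[n-1]": 0x7F036C700D98,  # "Cannot access memory"
    "fMap[n]": 0x7F08CC17BC18,    # readable
}

# Assumption: x86-64 Linux with a 47-bit user address space.
USER_SPACE_LIMIT = 1 << 47


def looks_plausible(addr, alignment=8):
    """Hypothetical check: non-null, inside user space, and aligned."""
    return 0 < addr < USER_SPACE_LIMIT and addr % alignment == 0


for name, addr in POINTERS.items():
    print(f"{name}: {addr:#x} plausible={looks_plausible(addr)}")
```

Under these assumptions the bad pointer passes the same checks as the good ones, so a test of this kind cannot tell a stale pointer from a live one. That is consistent with memory that was valid at some point and was unmapped later, rather than with a slot holding random garbage.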
From: jthiltges
Sent: Thursday, November 09, 2017 3:04 PM
To: xrootd/xrootd
Cc: Andrew Hanushevsky ; Comment
Subject: Re: [xrootd/xrootd] segfault in XrdXrootdMonFile::DoXFR (#618)
Logs (31MB): https://t2.unl.edu/jthiltge/hcc-stash-xrootd-618.txt
Config: https://t2.unl.edu/jthiltge/hcc-stash-xrootd-618.cfg
I believe this is the first time it's happened, yes. Checked back through the logs and I don't see any similar segfaults. Since we hadn't seen it with v4.7.0 or previous, I think the worry was if this was a new issue introduced in v4.7.1. If that's unlikely in your opinion, I'm happy to ignore it for now and pursue things further if we see more errors.
Regards,
John
Hi John,
Well, clearly it shouldn't have happened. We didn't touch anything in the
monitoring section in 4.7.1 as far as I can tell. So, let's see if this
happens again. It would be nice if I could see the core file but that
would require you giving me an account (we've rarely been successful
importing a core file).
Andy
On Fri, 10 Nov 2017, jthiltges wrote:
I believe this is the first time it's happened, yes. Checked back through the logs and I don't see any similar segfaults.
Since we hadn't seen it with v4.7.0 or previous, I think the worry was if this was a new issue introduced in v4.7.1. If that's unlikely in your opinion, I'm happy to ignore it for now and pursue things further if we see more errors.
Regards,
John
I sincerely appreciate your help. I set up a VM with matching software and debuginfo packages, along with the coredump, and emailed the credentials directly.