Xrootd crashed (as manager) when using f-stream monitoring? #155
Comments
Hi Tommaso, You are right, while the redirector doesn't need that directive neither Andy On Wed, 5 Nov 2014, Tommaso Boccali wrote:
|
ciao Andy! /afs/pi.infn.it/user/boccali/public/core.17299 you should be able to access it from lxplus. I did some more checks:
and GDB:
ciao ciao tom |
Hi Tommaso, What version of Linux are you running? Andy On Wed, 5 Nov 2014, Tommaso Boccali wrote:
|
[root@xrootd-redic xrootd]# uname -a [root@xrootd-redic xrootd]# cat /etc/issue |
Hi Tommaso, That particular strace probably isn't relevant. I'm having difficulties Andy On Wed, 5 Nov 2014, Tommaso Boccali wrote:
|
ok, huge output incoming ....
|
Hi Tommaso, OK, you will need to install the debug symbol RPM so I can get line Andy P.S. After installing debug symbols please just do a "where" on the thread On Wed, 5 Nov 2014, Tommaso Boccali wrote:
|
ciao, so here it is:
and the 'where' says
is this what you need? not sure how to find 'the thread that got the SEGV'. tom |
Hi Tommaso, Too bad you are leaving tonight (weren't you supposed to talk tomorrow?). Andy On Wed, 5 Nov 2014, Tommaso Boccali wrote:
|
Got it! Yes, tomorrow after 11 AM would be fine. Andy On Wed, 5 Nov 2014, Tommaso Boccali wrote:
|
so, thread #1 is the crashing one, so the backtrace is exactly what i sent you before ...
so
|
Executive summary: I sat with Andy and we looked into the problem. It seems we have hit an xrootd3 bug that is already fixed in xrootd4. This can be closed (sorry, I do not know how ;) Thanks a lot! tom |
Ciao, as instructed by CMS, we changed on all our nodes the monitoring endpoint from
xrootd.monitor all auth flush 30s mbuff 1472 window 5s dest files io info redir user xrootd.t2.ucsd.edu:9930
to
xrootd.monitor all fstat 60s lfn ops ssq xfr 5 ident 5m dest fstat info user CMS-AAA-EU-COLLECTOR.cern.ch:9330
After that, a few of us noticed that on both redirectors and site-redirectors xrootd crashes minutes after the start (we understand that this configuration is probably unneeded on a redirector, but many of us still have it for config consistency).
no output from the crash, just lines like
Nov 5 09:30:09 xrootd-redic kernel: [17619227.751667] xrootd[8524]: segfault at 8 ip 0000000000417743 sp 00007f2089c06cc0 error 6 in xrootd[400000+30000]
in the /var/log/messages
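For anyone reading such a line, here is a minimal Python sketch that pulls the fields out of the kernel message quoted above. It assumes the usual x86 Linux "segfault at" message layout and the standard x86 page-fault error bits (bit 0: page present, bit 1: write access, bit 2: user mode). The function name `decode_segfault` is made up for illustration and is not part of xrootd or any tool.

```python
import re

# Assumed layout of the x86 Linux kernel "segfault at" message, as in the log line above.
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)

def decode_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"))
    info = {
        "program": m.group("prog"),
        "pid": int(m.group("pid")),
        "fault_addr": int(m.group("addr"), 16),   # address that was accessed
        "ip": int(m.group("ip"), 16),             # instruction that faulted
        "page_present": bool(err & 1),            # bit 0 of the error code
        "access": "write" if err & 2 else "read", # bit 1 of the error code
        "user_mode": bool(err & 4),               # bit 2 of the error code
    }
    if m.group("base"):
        info["object"] = m.group("obj")
        # Distance of the faulting instruction from the start of the mapping.
        info["offset_in_object"] = info["ip"] - int(m.group("base"), 16)
    return info

LINE = ("Nov 5 09:30:09 xrootd-redic kernel: [17619227.751667] xrootd[8524]: "
        "segfault at 8 ip 0000000000417743 sp 00007f2089c06cc0 error 6 "
        "in xrootd[400000+30000]")
info = decode_segfault(LINE)
```

For the line above this reports a user-mode write to address 8 on a page that is not mapped, which is the typical signature of writing to a field through a null pointer, at offset 0x17743 inside the xrootd mapping. Turning that into a source line still needs the debug symbols.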
It does not seem to be a collector problem, indeed if we put
xrootd.monitor all auth flush 30s mbuff 1472 window 5s dest files io info redir user CMS-AAA-EU-COLLECTOR.cern.ch:9330
all is ok
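To make the comparison concrete, here is a rough Python sketch that lists which whitespace-separated tokens differ between the working directive and the crashing one. It is a naive token comparison, not a parser for the xrootd configuration syntax, and the helper names `monitor_options` and `token_diff` are invented for this example.

```python
def monitor_options(directive):
    """Split a directive into tokens, dropping the leading 'xrootd.monitor' keyword."""
    tokens = directive.split()
    if tokens and tokens[0] == "xrootd.monitor":
        tokens = tokens[1:]
    return tokens

def token_diff(old, new):
    """Return (removed, added) tokens, order-preserving and without duplicates."""
    old_t, new_t = monitor_options(old), monitor_options(new)
    removed = list(dict.fromkeys(t for t in old_t if t not in new_t))
    added = list(dict.fromkeys(t for t in new_t if t not in old_t))
    return removed, added

# The two directives quoted in this report, both pointing at the same collector.
WORKING = ("xrootd.monitor all auth flush 30s mbuff 1472 window 5s "
           "dest files io info redir user CMS-AAA-EU-COLLECTOR.cern.ch:9330")
CRASHING = ("xrootd.monitor all fstat 60s lfn ops ssq xfr 5 ident 5m "
            "dest fstat info user CMS-AAA-EU-COLLECTOR.cern.ch:9330")

removed, added = token_diff(WORKING, CRASHING)
print("only in working config: ", removed)
print("only in crashing config:", added)
```

The collector address appears in neither list, so what changes between the two cases is the set of monitoring options, in particular the `fstat` ones.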
is it a t-stream vs f-stream problem?
my xrootd version is
[root@xrootd-redic xrootd]# rpm -qa|grep xrootd
xrootd-client-libs-3.3.6-1.slc6.x86_64
xrootd-3.3.6-1.slc6.x86_64
xrootd-client-3.3.6-1.slc6.x86_64
xrootd-libs-3.3.6-1.slc6.x86_64
xrootd-server-libs-3.3.6-1.slc6.x86_64
(not sure about the other cases...)
and the config says
all.role manager
thanks!
tom