We are seeing segfaults and kernel traps (general protection faults) when running xrdcopy. The problem was first observed two weeks ago, and we immediately enabled saving of core dumps. It appeared again tonight, occurring 7 times between 00:30 and 05:35 (CEST). A core dump was saved for each of these seven failures:
$ ls -ltc core.*
-rw------- 1 dcache-mon users 169820160 May 17 05:32 core.13221
-rw------- 1 dcache-mon users 169877504 May 17 05:17 core.10943
-rw------- 1 dcache-mon users 102694912 May 17 05:02 core.8971
-rw------- 1 dcache-mon users 136257536 May 17 02:32 core.20364
-rw------- 1 dcache-mon users 136257536 May 17 02:17 core.18044
-rw------- 1 dcache-mon users 119476224 May 17 02:02 core.16144
-rw------- 1 dcache-mon users 186654720 May 17 00:32 core.3559
The logged information in /var/log/messages:
# grep kernel /var/log/messages
May 17 00:32:46 kernel: traps: xrdcopy[3596] general protection ip:7f653460acd4 sp:7f65306c4c10 error:0 in libXrdCl.so.2.0.0[7f6534545000+11e000]
May 17 02:02:31 kernel: xrdcopy[16193]: segfault at 2 ip 00007fbef46e946c sp 00007fbef0803b50 error 4 in libXrdCl.so.2.0.0[7fbef4684000+11e000]
May 17 02:32:31 kernel: xrdcopy[20399]: segfault at 2 ip 00007f645f48146c sp 00007f645b59bb50 error 4 in libXrdCl.so.2.0.0[7f645f41c000+11e000]
May 17 05:02:31 kernel: xrdcopy[8998]: segfault at 2 ip 00007f43563f746c sp 00007f4352511b50 error 4 in libXrdCl.so.2.0.0[7f4356392000+11e000]
May 17 05:17:46 kernel: xrdcopy[10986]: segfault at 2 ip 00007fea0295d46c sp 00007fe9fea77b50 error 4 in libXrdCl.so.2.0.0[7fea028f8000+11e000]
May 17 05:32:31 kernel: traps: xrdcopy[13257] general protection ip:7f3203a09cd4 sp:7f31ffac3c10 error:0 in libXrdCl.so.2.0.0[7f3203944000+11e000]
One note: we have not been able to find any entry in /var/log/messages for the core dump at 02:17 (core.18044).
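The kernel lines above report both the faulting instruction pointer and the base address at which libXrdCl.so.2.0.0 was mapped, so the offset of each crash inside the library can be computed as `ip - base`. The following is a minimal Python sketch of that arithmetic, not part of the original report. The function name `fault_offsets` and the regular expression are illustrative, and the sketch assumes the reported mapping starts at file offset 0 of the library, which is typical for the text segment but not verified here.

```python
import re

# Kernel lines copied from the /var/log/messages excerpt (timestamps omitted).
LOG = """\
traps: xrdcopy[3596] general protection ip:7f653460acd4 sp:7f65306c4c10 error:0 in libXrdCl.so.2.0.0[7f6534545000+11e000]
xrdcopy[16193]: segfault at 2 ip 00007fbef46e946c sp 00007fbef0803b50 error 4 in libXrdCl.so.2.0.0[7fbef4684000+11e000]
xrdcopy[20399]: segfault at 2 ip 00007f645f48146c sp 00007f645b59bb50 error 4 in libXrdCl.so.2.0.0[7f645f41c000+11e000]
xrdcopy[8998]: segfault at 2 ip 00007f43563f746c sp 00007f4352511b50 error 4 in libXrdCl.so.2.0.0[7f4356392000+11e000]
xrdcopy[10986]: segfault at 2 ip 00007fea0295d46c sp 00007fe9fea77b50 error 4 in libXrdCl.so.2.0.0[7fea028f8000+11e000]
traps: xrdcopy[13257] general protection ip:7f3203a09cd4 sp:7f31ffac3c10 error:0 in libXrdCl.so.2.0.0[7f3203944000+11e000]
"""

# Illustrative pattern: pid, instruction pointer, library name, mapping base.
PATTERN = re.compile(
    r"xrdcopy\[(\d+)\].*?ip[: ]([0-9a-f]+) .* in (\S+)\[([0-9a-f]+)\+[0-9a-f]+\]"
)


def fault_offsets(log):
    """Return (pid, library, hex offset) for every kernel fault line in log."""
    results = []
    for line in log.splitlines():
        match = PATTERN.search(line)
        if match:
            pid, ip, lib, base = match.groups()
            # Offset of the faulting instruction relative to the mapping base.
            offset = int(ip, 16) - int(base, 16)
            results.append((int(pid), lib, hex(offset)))
    return results


for pid, lib, offset in fault_offsets(LOG):
    print(pid, lib, offset)
# The two "general protection" lines give 0xc5cd4;
# the four "segfault at 2" lines give 0x6546c.
```

All six logged crashes therefore reduce to two distinct locations in the library. With xrootd-debuginfo installed, such an offset could be passed to a tool like `addr2line -f -C -e <path to libXrdCl.so.2.0.0> 0x6546c` to look up the function and source line; the library path is left as a placeholder because it is not given in the report.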
The xrdcopy command is run with debug level 3 and the lines containing error are the following:
@samuambroj: sorry for the late response; somehow I hadn't noticed your issue before. Do you have the corresponding stack traces, including line numbers?
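One possible way to produce such stack traces from the saved core files is to run gdb in batch mode on each of them. The sketch below is only an illustration. The executable path `/usr/bin/xrdcopy`, the helper names `gdb_backtrace_cmd` and `dump_backtraces`, and the output file naming are assumptions that do not come from this thread. Line numbers appear in the backtraces only if matching debug symbols (for example xrootd-debuginfo) are installed.

```python
import glob
import subprocess


def gdb_backtrace_cmd(core, exe="/usr/bin/xrdcopy"):
    """Build a gdb batch command printing a full backtrace of every thread.

    The default executable path is an assumption; adjust it to the binary
    that actually produced the core file.
    """
    return ["gdb", "--batch", "-ex", "thread apply all bt full", exe, core]


def dump_backtraces(pattern="core.*", exe="/usr/bin/xrdcopy"):
    """Run gdb on every core file matching pattern and save its output.

    Returns the list of backtrace files written (empty if nothing matched).
    """
    written = []
    for core in sorted(glob.glob(pattern)):
        result = subprocess.run(
            gdb_backtrace_cmd(core, exe),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        # The output name does not match "core.*", so a re-run skips it.
        out_name = "backtrace-" + core + ".txt"
        with open(out_name, "w") as handle:
            handle.write(result.stdout)
        written.append(out_name)
    return written
```

Calling `dump_backtraces()` in the directory holding the core files would write one text file per core, which is usually far smaller to attach to an issue than the core dumps themselves.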
Dear Xrootd Team,
The xrootd related packages installed on the client machine (note that xrootd-debuginfo was installed two weeks ago):
# rpm -qa | grep -i xroo
xrootd-client-libs-4.9.1-1.el7.x86_64
gfal2-plugin-xrootd-2.16.1-1.el7.x86_64
xrootd-libs-4.9.1-1.el7.x86_64
xrootd-client-4.9.1-1.el7.x86_64
nordugrid-arc-plugins-xrootd-5.4.3-1.el7.x86_64
xrootd-debuginfo-4.9.1-1.el7.x86_64
We can send you the 7 core dumps and the detailed xrdcopy output of the last failure. The total uncompressed size is around 1 GB.
Best,
Samuel