Segmentation fault for gfal-copy with gsiftp on RHEL-8 #6

Closed
XMol opened this issue Dec 18, 2023 · 4 comments

Comments

XMol commented Dec 18, 2023

Hello GFAL developers,

We've installed a new VM with RHEL-8 to run our regular transfer tests using the gfal2-tools. They used to run just fine on SL-7, but now our gfal-copy commands fail intermittently with segmentation faults, and only with the gsiftp transfer protocol (see attached example log).

So far, no core dumps have been produced; we're trying to adjust system limits and kernel parameters to enable them. Other relevant information you might need:

$ python3 --version
Python 3.6.8

$ rpm -qa gfal2*
gfal2-plugin-file-2.21.5-1.el8.x86_64
gfal2-plugin-gridftp-2.21.5-1.el8.x86_64
gfal2-2.21.5-1.el8.x86_64
gfal2-plugin-http-2.21.5-1.el8.x86_64
gfal2-plugin-srm-2.21.5-1.el8.x86_64
gfal2-plugin-xrootd-2.21.5-1.el8.x86_64
gfal2-util-scripts-1.8.0-1.el8.noarch

$ uname -a
Linux gm-1-kit-e 4.18.0-513.5.1.el8_9.x86_64 #1 SMP Fri Sep 29 05:21:10 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

Do you have other hints on what we should check to find out more information for you?
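
One thing that could be tried while the system-wide limits are being sorted out is to raise the core-dump limit for just the failing command and to switch on Python's built-in `faulthandler`, which prints the Python-level traceback when the interpreter receives SIGSEGV. The following is only a minimal sketch: the helper names `run_with_diagnostics` and `died_from_segfault` are made up for illustration, the URLs in the usage comment are placeholders, and it assumes the gfal-copy script does not start the interpreter with `-E` or `-I` (which would make it ignore `PYTHONFAULTHANDLER`). It sticks to calls available in Python 3.6.

```python
import os
import resource
import signal
import subprocess


def _raise_core_limit():
    # Runs in the child just before exec: lift the soft core-size limit
    # up to the hard limit. If the hard limit itself is 0, this changes nothing.
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))


def run_with_diagnostics(argv):
    """Run argv with the core limit raised and faulthandler enabled (hypothetical helper)."""
    env = dict(os.environ)
    env["PYTHONFAULTHANDLER"] = "1"  # honoured by CPython 3.3+ at interpreter startup
    return subprocess.run(
        argv,
        env=env,
        preexec_fn=_raise_core_limit,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


def died_from_segfault(proc):
    # subprocess reports death by signal N as returncode -N.
    return proc.returncode == -signal.SIGSEGV


# Usage with placeholder URLs (not taken from our setup):
# proc = run_with_diagnostics(
#     ["gfal-copy", "gsiftp://source.example.org/path/file", "file:///tmp/file"])
# if died_from_segfault(proc):
#     print(proc.stderr)  # would contain the faulthandler traceback, if any
```

If the hard limit is already 0 in the environment that launches the tests (cron, a systemd unit, a login shell), the helper cannot raise it, and the limit has to be changed where that environment is configured.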

Best regards,
Xavier.

XMol commented Dec 20, 2023

Hello again,

So, we have adjusted system limits such that core dumps can be produced in principle, but not for gfal-copy/python3:

$ grep systemd-coredump /var/log/messages | tail
Dec 19 23:16:02 gm-1-kit-e systemd-coredump[3646763]: Process 3646718 (python3) of user 1000 dumped core.
Dec 19 23:16:02 gm-1-kit-e systemd[1]: systemd-coredump@258-3646762-0.service: Succeeded.
Dec 20 01:16:02 gm-1-kit-e systemd-coredump[3658820]: Resource limits disable core dumping for process 3658734 (python3).
Dec 20 01:16:02 gm-1-kit-e systemd-coredump[3658820]: Process 3658734 (python3) of user 1000 dumped core.
Dec 20 01:16:02 gm-1-kit-e systemd[1]: systemd-coredump@259-3658819-0.service: Succeeded.
Dec 20 08:31:02 gm-1-kit-e systemd-coredump[3702089]: Resource limits disable core dumping for process 3702066 (python3).
Dec 20 08:31:02 gm-1-kit-e systemd-coredump[3702089]: Process 3702066 (python3) of user 1000 dumped core.
Dec 20 08:31:02 gm-1-kit-e systemd[1]: systemd-coredump@260-3702087-0.service: Succeeded.
Dec 20 08:39:36 gm-1-kit-e systemd-coredump[3703091]: Process 3703089 (sleep) of user 1000 dumped core.#012#012Stack trace of thread 3703089:#012#0  0x00007fe5fe35d8e8 __nanosleep (libc.so.6)#012#1  0x0000564612d25b47 rpl_nanosleep (sleep)#012#2  0x0000564612d25920 xnanosleep (sleep)#012#3  0x0000564612d22a88 main (sleep)#012#4  0x00007fe5fe29ed85 __libc_start_main (libc.so.6)#012#5  0x0000564612d22b5e _start (sleep)
Dec 20 08:39:36 gm-1-kit-e systemd[1]: systemd-coredump@261-3703090-0.service: Succeeded.
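
To make the pattern in these messages easier to see, the lines can be grouped per crashed process, flagging the ones for which systemd-coredump reported that resource limits disabled the dump. This is a rough sketch that assumes the two message formats shown in the excerpt above; the function name `summarize_coredump_log` is made up for illustration.

```python
import re

# Message formats as they appear in the systemd-coredump lines above.
LIMIT_RE = re.compile(
    r"Resource limits disable core dumping for process (\d+) \(([^)]+)\)")
DUMPED_RE = re.compile(
    r"Process (\d+) \(([^)]+)\) of user (\d+) dumped core")


def summarize_coredump_log(lines):
    """Return (pid, command, limited) tuples sorted by pid.

    `limited` is True when a 'Resource limits disable core dumping'
    message was logged for that pid.
    """
    crashed = {}     # pid -> command name
    limited = set()  # pids whose dump was suppressed by resource limits
    for line in lines:
        match = LIMIT_RE.search(line)
        if match:
            limited.add(int(match.group(1)))
            continue
        match = DUMPED_RE.search(line)
        if match:
            crashed[int(match.group(1))] = match.group(2)
    return [(pid, cmd, pid in limited) for pid, cmd in sorted(crashed.items())]


# Usage: feed it the grep output, for example
# with open("/var/log/messages") as log:
#     for pid, cmd, was_limited in summarize_coredump_log(log):
#         print(pid, cmd, "limited" if was_limited else "dump allowed")
```

Applied to the excerpt, this marks the python3 processes 3658734 and 3702066 as limited, while python3 process 3646718 and the sleep test process 3703089 are not.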

Do you maybe have an idea what else to check?

Best regards,
Xavier.

XMol commented Dec 20, 2023

Considering that the failures occur exclusively with GridFTP, I'm moving this issue over to cern-fts/gfal2.

@mpatrascoiu
Contributor

Hello Xavier,

The gfal2 repository is more suited, but given this affects the GridFTP protocol (and on RHEL8), we won't be able to look into it.

The WLCG plan is to decommission GridFTP altogether, and Gfal2 will follow that.
We are also migrating to Alma9 as the supported platform, starting in June 2024.

Just saying: don't get your hopes up on this.

Cheers,
Mihai

XMol commented Dec 20, 2023

Thanks for the comment, Mihai. I, too, was hoping that the errors wouldn't be frequent enough to trigger alarms, but alas, they are.
