gsi el7 clients fail to connect to alma9 MGM #2014
Comments
We are seeing exactly the same problem. Client Centos7(xrootd5.5.4-1) server Alma9(xrootd5.5.5-1) . update-crypto-policies --set DEFAULT:SHA1 on server and restarting has no effect. xrdcp from Alma 9 client to Alma 9 server works fine. [leech@pplxint11 ~]$ xrdcp -f -d1 xroot://pplxwn021//tmp/zap local_zap2 I'm also willing to test, as this is stalling our upgrade of all systems to Alma 9. Just a thought. Major upgrade to openssl3 on EL9. Lots of legacy stuff has been dropped. Cheers. |
This discusses some additional things you may need to do:
https://computingforgeeks.com/configure-system-wide-cryptographic-policies/
However, while this should work for the RH distribution, it's not at all clear
crypto policies will work in the Alma distribution, at least not without
some additional effort.
Andy
…On Thu, 25 May 2023, mike-leech wrote:
We are seeing exactly the same problem. Client Centos7(xrootd5.5.4-1) server Alma9(xrootd5.5.5-1) .
update-crypto-policies --set DEFAULT:SHA1 on server and restarting has no effect.
xrdcp from Alma 9 client to Alma 9 server works fine.
[leech@pplxint11 ~]$ xrdcp -f -d1 xroot://pplxwn021//tmp/zap local_zap2
230525 11:37:45 255389 secgsi_ClientDoCert: could not instantiate session cipher using cipher public info from server
[2023-05-25 11:37:45.619578 +0100][Error ][XRootDTransport ] [pplxwn021:1094.0] Auth protocol handler for gsi refuses to give us more credentials Secgsi: ErrParseBuffer: could not instantiate session cipher : kXGS_cert
[2023-05-25 11:37:45.619725 +0100][Error ][AsyncSock ] [pplxwn021:1094.0] Socket error while handshaking: [FATAL] Auth failed
[2023-05-25 11:37:45.619845 +0100][Error ][PostMaster ] [pplxwn021:1094] elapsed = 1, pConnectionWindow = 120 seconds.
[2023-05-25 11:37:45.619895 +0100][Error ][PostMaster ] [pplxwn021:1094] Unable to recover: [FATAL] Auth failed.
[2023-05-25 11:37:45.619932 +0100][Error ][XRootD ] [pplxwn021:1094] Impossible to send message kXR_open (file: /tmp/zap, mode: 00, flags: kXR_open_read kXR_async kXR_retstat ). Trying to recover.
[0B/0B][100%][==================================================][0B/s]
Run: [FATAL] Auth failed: Secgsi: ErrParseBuffer: could not instantiate session cipher : kXGS_cert (source)
I'm also willing to test, as this is stalling our upgrade of all systems to Alma 9.
Just a thought. Major upgrade to openssl3 on EL9. Lots of legacy stuff has been dropped.
Cheers.
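When triaging many client logs, it can help to separate this specific failure from other authentication errors. The following is a minimal Python sketch, assuming the xrdcp debug output has already been captured into a string. The function name `is_session_cipher_failure` is hypothetical; the marker strings are copied from the output quoted above.

```python
# Marker strings copied from the xrdcp debug output quoted in this thread.
MARKERS = (
    "could not instantiate session cipher",
    "kXGS_cert",
)


def is_session_cipher_failure(log_text: str) -> bool:
    """Return True only if every marker of the session-cipher failure is present."""
    return all(marker in log_text for marker in MARKERS)


sample = (
    "Run: [FATAL] Auth failed: Secgsi: ErrParseBuffer: "
    "could not instantiate session cipher : kXGS_cert (source)"
)
print(is_session_cipher_failure(sample))         # True
print(is_session_cipher_failure("Auth failed"))  # False
```

A plain "Auth failed" line is deliberately not treated as a match, since other gsi problems end with the same generic message.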
--
Reply to this email directly or view it on GitHub:
#2014 (comment)
You are receiving this because you are subscribed to this thread.
Message ID: ***@***.***>
|
@abh3 Alma and RH are the same distro technically and as i said i have
so all possible lowest requirements. L.E. my bad, this i did not mention in this ticket, only in private chat |
What about doing the reverse, that is, setting the policy to "FUTURE" on the client side? |
centos7 does not have update-crypto-policies nor crypto-policies.. |
I've also tried enabling the openssl legacy provider directly on the server. Still no luck I'm afraid. [root@p ~]# update-crypto-policies --show |
Well, technically sort of. Alma is not based on any release that Redhat
creates. So, it can be, in some respects, very different. The whole
crypto scheme was also invented by Redhat and I am skeptical that Alma
faithfully duplicates it.
That said, there are several issues with the "future" x509 requirements
with respect to backward compatibility. SHA1 is only one of them. To
capture all the conflicts you should be setting the default to "legacy"
which (at least in Redhat) adjusts for all the conflicts. So use
update-crypto-policies --set LEGACY
Now whether that works in Alma is a different question.
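Since the thread mixes base policies (`LEGACY`, `DEFAULT`) with submodules (`DEFAULT:SHA1`), a small helper can make it explicit what is actually active on a host. This is only a sketch: `split_policy` and `current_policy` are hypothetical helper names, and the only command used is `update-crypto-policies --show`, which appears elsewhere in this thread.

```python
import shutil
import subprocess


def split_policy(policy: str):
    """Split a policy string such as 'DEFAULT:SHA1' into (base, [submodules])."""
    base, *modules = policy.strip().split(":")
    return base, modules


def current_policy():
    """Return the active policy string, or None if the tool is missing (e.g. on EL7)."""
    tool = shutil.which("update-crypto-policies")
    if tool is None:
        return None
    result = subprocess.run(
        [tool, "--show"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


print(split_policy("DEFAULT:SHA1"))  # ('DEFAULT', ['SHA1'])
print(split_policy("LEGACY"))        # ('LEGACY', [])
```

On the server, `split_policy(current_policy())` would show at a glance whether `LEGACY` is really the base policy, provided `current_policy()` did not return `None`.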
|
Hi @abh3 so, Alma is a Redhat clone as Centos was for EL7 .. also, see their statement:
|
Hi, we're seeing the same issue at Durham. We have C7, R9 gateways running xroot-1:5.5.4 and receive cipher issues between C7 to R9 communication even with legacy ciphers turned on. We're finding R8 to be a good middle man between the two but it means we're running a third gateway to permit cross communication. |
Why are we looking at SHA-1 stuff? The error message refers to the failure of a session cipher. In the case of a RHEL7, it is exchanged via 512 bit DH, no? |
i just tried with a pmod with this content:
applied and rebooted and still i get (from a centos 7 client):
|
Hi Adrian, For crypto policies, I believe the minimum Diffie Hellman is 2048 (unless Thanks, Brian |
Hi @bbockelm erm, why would i increase something when the need is to lower everything down until the EL7 client works? the target is an Alma9 EOS MGM and the client is a Centos7 (and this is actually the problem, as a fedora 38 client works without problem) |
@adriansev - I'm not suggesting it's a solution, just want to test the theory that this is where the problem is. It would be very useful to understand it's indeed in the DH settings and not elsewhere in the code. (Memory is very hazy of my last read of this code but I believe the DH key itself is later on truncated so there's only 512 bits of security even if the buffers fed to OpenSSL are 2048 bit.... i.e., there's no effective security here unless you're using |
@bbockelm yeah, i did not get it but make sense. so i did as you suggested, rebooted the machine and the error is the same. for reference the overall current policy looks like this: https://asevcenc.web.cern.ch/asevcenc/eos_config_auger/new_pol |
Let's see here, the following RedHat article shows how you can display
what the requirements are for each crypto policy using gnutls-utils and
httpd. Here we are interested in what the requirements are when you execute
update-crypto-policies --set LEGACY
Follow the server recipe in the article using alma 9. Then we can see
where EL7 is diverging from what alma 9 wants.
See: https://access.redhat.com/articles/3666211
According to RH, LEGACY should ensure maximum compatibility with Red
Hat Enterprise Linux 5 and earlier; it is less secure due to an
increased attack surface. In addition to the DEFAULT level algorithms and
protocols, it includes support for the TLS 1.0 and 1.1 protocols. The
algorithms DSA, 3DES, and RC4 are allowed, while RSA keys and
Diffie-Hellman parameters are accepted if they are at least 1023 bits
long.
This may not be the case in Alma 9 so let's find out.
Andy
|
Hi @abh3 see the link posted above: https://asevcenc.web.cern.ch/asevcenc/eos_config_auger/new_pol Is built with SHA1 on top of LEGACY after which i added this pmod (well the dh is now 2048):
the end result is the policy dump referenced above |
Could you export XrdSecDEBUG=1 and try connecting again and post the log output. It may tell us what the server really expects. |
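For repeated tests it may be convenient to script this. A minimal sketch, assuming `xrdcp` is on the PATH; the helper names are hypothetical, while the `XrdSecDEBUG` variable and the `-f -d1` flags are the ones already used in this thread.

```python
import os
import subprocess


def debug_env(base_env=None):
    """Copy the environment and switch on XrdSec debug output."""
    env = dict(os.environ if base_env is None else base_env)
    env["XrdSecDEBUG"] = "1"
    return env


def xrdcp_command(source_url, destination):
    """Build the same xrdcp invocation that was used earlier in the thread."""
    return ["xrdcp", "-f", "-d1", source_url, destination]


def run_with_sec_debug(source_url, destination):
    """Run xrdcp with security debugging on and return combined stdout/stderr."""
    result = subprocess.run(
        xrdcp_command(source_url, destination),
        env=debug_env(),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,  # a failed copy is the expected case while debugging
    )
    return result.stdout


print(xrdcp_command("xroot://pplxwn021//tmp/zap", "local_zap2"))
```

Calling `run_with_sec_debug("xroot://pplxwn021//tmp/zap", "local_zap2")` on the client would then return the log text to post here.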
Oh never mind. I see you did that already.
|
Ah, could you post the OpenSSL versions of the centos7, fedora38, and Alma9 machines? |
EL7: 1.0.2k |
OK, so it might be a compatibility issue between 1.0.2 and 3.0.7. I did
read that OpenSSL does not claim complete compatibility between the two
releases. It might work, but there are no guarantees. Another thing to track down. It
should be compatible with 1.1.1 but there appears to be skepticism even for that
combination.
…On Thu, 1 Jun 2023, Adrian Sevcenco wrote:
EL7: 1.0.2k
Alma9: 3.0.7
Fedora38: 3.0.8
|
Just to add, it works when connecting from Rocky8 client (openssl v1.1.1k). |
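To keep track of what has been observed so far, the results reported in this thread can be grouped by OpenSSL release series. This is just a bookkeeping sketch in Python; the table only restates the client results posted above (all against the Alma 9 server), and the `series` helper is a hypothetical name.

```python
# (client OS, client OpenSSL version, connection to the Alma 9 server worked?)
OBSERVED = [
    ("CentOS 7", "1.0.2k", False),
    ("Rocky 8", "1.1.1k", True),
    ("Alma 9", "3.0.7", True),
    ("Fedora 38", "3.0.8", True),
]


def series(version: str) -> str:
    """Reduce a version such as '1.0.2k' to its release series '1.0'."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


failing = sorted({series(v) for _, v, ok in OBSERVED if not ok})
working = sorted({series(v) for _, v, ok in OBSERVED if ok})
print(failing, working)  # ['1.0'] ['1.1', '3.0']
```

So far only the 1.0 series fails, which is consistent with the wire-compatibility theory but does not prove it.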
Oh, yes, the majority of the talk here is about API compatibility which I know is
problematic. What we are talking about here is wire compatibility. That is
also questionable given the large changes in 3.0. One could convince
oneself that this is the key issue since it works with fedora38 which is
in the same release series but fails for 1.0.2. Any chance you would try
the same test with 1.1.1?
|
Thanks for that observation. So, it might be that 1.0.2 is simply not
compatible with 3.0. OK, so can you upgrade your EL7 machine to 1.1.1 and
change nothing else and see if it works?
|
BTW, to try to make it work, we even changed the openssl config on Alma9 - https://www.practicalnetworking.net/practical-tls/openssl-3-and-legacy-providers/ |
Re: upgrade EL7 to use openssl 1.1.1 I'm new to xrootd but don't you have to compile xrootd to use openssl 1.1.1 on EL7? ldd /usr/bin/xrdcp |
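One way to answer the "which OpenSSL is this binary using" question is to filter the `ldd` output for the libssl/libcrypto sonames. The sketch below is in Python and makes several assumptions: the sample text is illustrative and not captured from a real machine, the helper name `ssl_libraries` is hypothetical, and the soname mapping (`.so.10` for the EL7 system OpenSSL 1.0.2, `.so.1.1` for the EPEL openssl11 build) is my understanding rather than something confirmed in this thread. Note also that plugins loaded at run time will not appear in `ldd /usr/bin/xrdcp`, so running it on the plugin library itself may be more telling.

```python
import re


def ssl_libraries(ldd_output: str):
    """Return the libssl/libcrypto sonames that appear in ldd output."""
    pattern = r"\b(lib(?:ssl|crypto)\.so[.\w]*)\s+=>"
    return sorted(set(re.findall(pattern, ldd_output)))


# Illustrative sample only; addresses and paths are made up.
sample = """\
\tlibXrdCl.so.3 => /lib64/libXrdCl.so.3 (0x00007f0000000000)
\tlibssl.so.10 => /lib64/libssl.so.10 (0x00007f0000100000)
\tlibcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007f0000200000)
"""
print(ssl_libraries(sample))  # ['libcrypto.so.10', 'libssl.so.10']
```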
Actually i do have openssl11 installed from epel:
but the epel provided xrootd is build with system openssl not with epel openssl @amadio is there a possibility of a test rpm for EL7 that requires and use openssl11? If this works, then the next EL7 xrootd release should work but the gfal packager for epel should also be contacted to repackage gfal on the new dependencies. |
Sure, I will create RPMs using OpenSSL 1.1 from epel for testing. I will add a comment here when they are ready. |
I think I figured it out. The problem is this one: Basically, in 2019 OpenSSL overhauled its DH parameter generation code, which resulted in the server sending newly generated DH parameters that older clients did not like. It appears the more lenient client-side check was kept but eventually the server-side change was reverted during 1.1.1 -- but based on some GDB footwork, it's back in 3.0.0. Now, options:
I think (2) is the more viable option; hardcoding a known good group is a fairly common solution (see https://wiki.openssl.org/index.php/Diffie-Hellman_parameters). Unfortunately, XRootD's 512-bit DH is weak enough to not be considered secure by the 1990's; therefore, there's no standardized 512-bit DH group that we can easily reuse. Instead, I'd just suggest generating any old one by hand and hardcoding that. Here's an example:
Loading that on the server side would replace the generation code: https://github.com/xrootd/xrootd/blob/master/src/XrdCrypto/XrdCryptosslCipher.cc#L507-L518 For other sizes of DH parameters, one could simply do a lookup table. RFC 3526 covers examples up through 4096. |
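To illustrate the "fixed group plus lookup table" idea in isolation, here is a toy Python sketch of a Diffie-Hellman exchange where the server picks (p, g) from a hardcoded table instead of generating them. Everything in it is an assumption made for illustration: the table `FIXED_GROUPS`, the helper names, and the group itself (the Mersenne prime 2^127 - 1 with generator 3), which is not the group XRootD uses and is far too small for any real use.

```python
import secrets

# Hypothetical lookup table of fixed groups, keyed by prime size in bits.
# The single entry is a toy group for illustration only, NOT XRootD's.
FIXED_GROUPS = {
    127: (2**127 - 1, 3),
}


def fixed_group(bits: int):
    """Look up hardcoded (p, g) instead of generating new parameters."""
    return FIXED_GROUPS[bits]


def keypair(p: int, g: int):
    """Pick a private exponent in [2, p - 2] and derive the public value."""
    private = secrets.randbelow(p - 3) + 2
    return private, pow(g, private, p)


def shared_secret(peer_public: int, private: int, p: int) -> int:
    """Combine the peer's public value with our private exponent."""
    return pow(peer_public, private, p)


p, g = fixed_group(127)
server_private, server_public = keypair(p, g)
client_private, client_public = keypair(p, g)

server_view = shared_secret(client_public, server_private, p)
client_view = shared_secret(server_public, client_private, p)
print(server_view == client_view)  # True
```

Because the parameters never change, they can be checked once against old and new clients, rather than depending on what a given OpenSSL release happens to generate.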
Fantastic news that you found the problem!!! but it seems to me that there will not be a fast resolution to the problem and i am pressed by the beneficiary of this EOS installation to be put in production as soon as possible, so i will have to reinstall it with Centos 7 and let the "big guys" handle this problem. Thanks a lot!! |
I have put CentOS 7 RPMs linking against OpenSSL 1.1 from EPEL here: The file with the repository configuration is here: This repository is temporary, just for testing, it may be removed after this is tested. |
@adriansev Could you please try to connect with the client above? For convenience (copy/paste into your terminal):
Cheers, |
@amadio - could you test #2026? I did some initial testing on my side and it restores the compatibility between RHEL7 clients and RHEL9. Unfortunately, I'm not sure a build against OpenSSL 1.1.1 is necessary anymore. Because the issue is traced to the client - and we can't change all existing clients in one fell swoop - the fix must be server side. |
I've tested the client linked against OpenSSL 1.1 and it works against the unpatched server. I also tested a server with this patch on Alma 9 with a client on lxplus7 and it also works. |
OK, I merged this. However, could you rerun your test on the latest merge as @bbockelm made some last minute changes. I don't see how they will affect the test but you never know. |
@amadio I apologize for the silence, returning from holiday was a little bit busy. |
Hi @adriansev - What you post appears to be a different problem:
The "digest" object here is used to sign messages. Looking at the code, it defaults to
(ref: https://xrootd.slac.stanford.edu/doc/dev56/sec_config.htm) If that doesn't solve it, let's file a separate ticket as it's a distinct issue. Brian |
I have tested the current master branch (i.e. after this has been merged) on the same Alma9 machine, and I can connect with a client on CentOS 7 to it without problems. |
@bbockelm oh, yes, sorry for the noise, as soon as i put |
It seems that there is a client-side openssl problem when the connecting client is on Centos7 and the server is Alma9 (in my case an EOS Alma9 MGM).
The same connection with the same client (5.5.5 at this moment) from a Fedora38 works without problem.
The gsi debug output for both cases can be inspected here: https://asevcenc.web.cern.ch/asevcenc/gsi_dump/
Looking at the code it seems that the problem is at this point:
https://github.com/xrootd/xrootd/blob/master/src/XrdSecgsi/XrdSecProtocolgsi.cc#L3320
(so everything else up to this point is ok)
but i'm not sure if the implementation of the Cipher method is this one:
https://github.com/xrootd/xrootd/blob/master/src/XrdCrypto/XrdCryptosslCipher.cc#L241
and what could be the problem.
Let me know if i can enable tracing options and provide more logging.
Thanks a lot!