Clear the OpenSSL error queue in XrdCryptosslX509ParseBucket() #1465
…e(), and SSL_connect() for client connections" This reverts commit 9d355f6.
I thought the errors we are seeing now happen on UNL gridftps. But I can sure pull this in and run it on UCSD caches. |
Well, yes. I am also confused about this patch. On the client side,
`ERR_clear_error()` is called after authentication, so we know that the error
queue, if one was left over from gsi, is gone. On the server side we now
do the same. Adding additional calls within gsi is a no-win situation:
there are so many places where gsi can generate an error without clearing it
that the mind boggles, and it's likely that not all of them have been caught.
That's why treating gsi as a black box and doing an external clear is a
better option.
Now, that said, I think the other server-side issue may be that https also
uses parts of gsi as well as other OpenSSL routines that may leave junk
in the error queue. Finding all of those places is also a hopeless case (at
least for now).
There is one additional fix I can try: when a thread is returned to the
thread pool, the pool manager simply clears the error queue before the
thread is allowed to be reused. That is a sledgehammer approach, but we
might as well try it, as it has the least impact and pretty well puts a
nail in this coffin.
Shall I?
Andy
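A minimal sketch of the clear-on-return idea described above, written as a plain C++ model rather than XRootD code. The `ErrorQueue` alias, `run_on_pooled_thread()`, `leaky_job()` and `clean_job()` are hypothetical names for illustration; in a real server the `queue.clear()` line would presumably be a call to OpenSSL's `ERR_clear_error()` at the point where the thread goes back to the pool.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-thread OpenSSL error queue.
using ErrorQueue = std::deque<std::string>;
using Job = std::function<void(ErrorQueue&)>;

// Models one pooled worker thread that is reused for unrelated jobs.
// Returns, for each job, how many stale errors it found on entry.
std::vector<std::size_t> run_on_pooled_thread(const std::vector<Job>& jobs,
                                              bool clear_on_return) {
  ErrorQueue queue;  // lives as long as the (modelled) thread
  std::vector<std::size_t> stale_on_entry;
  for (const Job& job : jobs) {
    stale_on_entry.push_back(queue.size());
    job(queue);
    // The point where the thread would be handed back to the pool.
    // Assumption: a real server would call ERR_clear_error() here.
    if (clear_on_return) queue.clear();
  }
  return stale_on_entry;
}

// A job that fails and forgets to clean up, like an unpatched gsi path.
void leaky_job(ErrorQueue& q) { q.push_back("certificate parse failed"); }

// A job that does not touch the queue.
void clean_job(ErrorQueue&) {}
```

Running the jobs `{leaky_job, clean_job, leaky_job, clean_job}` without the clear lets later jobs inherit errors from earlier ones; with the clear, every job starts from an empty queue no matter which code path leaked.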
…On Tue, 8 Jun 2021, Matevž Tadel wrote:
I thought the errors we are seeing now happen on UNL gridftps. But I can sure pull this in and run it on UCSD caches.
--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
#1465 (comment)
########################################################################
Use REPLY-ALL to reply to list
To unsubscribe from the XROOTD-DEV list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1
|
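My goal was to get rid of the `ERR_clear_error()` calls in `XrdTlsSocket`. If the error queue clearing in f74d453 is sufficient, then this PR can be ignored. I'll do some testing. |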
Thanks, JT. If it's not, we go with plan B and try to clear them in the
server at thread return time. This is getting to be a bear, sigh.
…On Tue, 8 Jun 2021, jthiltges wrote:
My goal was to get rid of the `ERR_clear_error()` calls in `XrdTlsSocket`. If the error queue clearing in f74d453 is sufficient, then this PR can be ignored. I'll do some testing.
|
Andy, remember, John is also seeing these errors in XCache when auth is enabled. I don't know if http access is enabled there as well. |
Hi Matevz,
Yes, the unfortunate part is that https is enabled and is happily using
gsi as well. JT is going through the tedious steps of eliminating each
issue while we add additional fixes as needed. He is a good trooper!
Andy
…On Tue, 8 Jun 2021, Matevž Tadel wrote:
Andy, remember, John is also seeing these errors in XCache when auth is enabled. I don't know if http access is enabled there as well.
|
I did some testing with master (bbf477b) and reverting the `XrdTlsSocket` `ERR_clear_error()` patch (9d355f6). I'm still seeing … I'm suspecting that …
xrootd/src/XrdCl/XrdClXRootDTransport.cc, lines 2314 to 2320 in bbf477b
It turns out that while our servers have TLS configured (serverFlags = 3592421377), our local redirector does not (serverFlags = 3145730).
That would explain why this PR seemed to address the issue, since it puts the SSL error clearing further down. |
@jthiltges: ahh, OK, I see your point. The error queue is cleared only if the channel is encrypted. However, since multiple channels share the same event-loop thread, a leftover error may still interfere with another channel that is encrypted. Could you try removing the if statement at xrootd/src/XrdCl/XrdClXRootDTransport.cc, lines 2314 to 2315 in bbf477b,
and see if this helps? |
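To illustrate the point about shared event-loop threads, here is a small model in plain C++ (not XRootD or OpenSSL code). The `Event` enum and `count_spurious_tls_failures()` are hypothetical. The model assumes that authentication always leaves one error behind, and that a TLS read is reported as failed whenever the queue is non-empty. That last assumption mirrors the documented OpenSSL requirement that the thread's error queue be empty before a TLS I/O call for `SSL_get_error()` to work reliably.

```cpp
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for the error queue of one event-loop thread.
using ErrorQueue = std::deque<std::string>;

// Events handled in order by a single shared event-loop thread.
enum class Event {
  AuthPlain,      // authentication on a channel without TLS
  AuthEncrypted,  // authentication on a channel with TLS
  TlsRead         // I/O on some encrypted channel
};

// Counts TLS reads that look like failures only because of stale errors.
int count_spurious_tls_failures(const std::vector<Event>& events,
                                bool clear_unconditionally) {
  ErrorQueue queue;
  int failures = 0;
  for (Event ev : events) {
    switch (ev) {
      case Event::AuthPlain:
      case Event::AuthEncrypted:
        // Assumption: credential handling always leaves an error behind.
        queue.push_back("leftover from credential handling");
        // Conditional variant: clean up only for encrypted channels.
        if (clear_unconditionally || ev == Event::AuthEncrypted) queue.clear();
        break;
      case Event::TlsRead:
        // A non-empty queue makes a healthy read look like an error.
        if (!queue.empty()) ++failures;
        break;
    }
  }
  return failures;
}
```

With the event sequence `AuthEncrypted, TlsRead, AuthPlain, TlsRead, TlsRead`, the conditional variant reports two spurious failures after the unencrypted channel authenticates, while the unconditional variant reports none.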
…eBucket()" This reverts commit 1710939.
XrdSecProtocolgsi::getCredentials() will leave OpenSSL errors on the queue, even with non-encrypted channels. Clean up the queue to avoid affecting other TLS traffic.
I cannot reproduce the TLS errors with this revised PR. |
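One way to guarantee that a function leaves no errors behind on any return path is a scope guard. The sketch below is a plain C++ model under stated assumptions, not the code in this PR: `tl_error_queue`, `ErrorQueueGuard`, `parse_bucket_raw()` and `parse_bucket()` are hypothetical names, and with OpenSSL the destructor body would presumably be a call to `ERR_clear_error()`.

```cpp
#include <deque>
#include <string>

// Hypothetical stand-in for OpenSSL's per-thread error queue.
thread_local std::deque<std::string> tl_error_queue;

// Clears the queue when the enclosing scope exits, on every return path.
struct ErrorQueueGuard {
  ErrorQueueGuard() = default;
  ErrorQueueGuard(const ErrorQueueGuard&) = delete;
  ErrorQueueGuard& operator=(const ErrorQueueGuard&) = delete;
  ~ErrorQueueGuard() { tl_error_queue.clear(); }
};

// Hypothetical parser that probes the input as a certificate, then as a key.
// A failed probe pushes an error even if the next probe succeeds.
bool parse_bucket_raw(const std::string& data) {
  if (data.rfind("CERT:", 0) != 0) {
    tl_error_queue.push_back("not a certificate");
    if (data.rfind("KEY:", 0) != 0) {
      tl_error_queue.push_back("not a key");
      return false;
    }
  }
  return true;
}

// Same parser, but the guard empties the queue before control returns.
bool parse_bucket(const std::string& data) {
  ErrorQueueGuard guard;
  return parse_bucket_raw(data);
}
```

Calling `parse_bucket_raw("KEY:abc")` succeeds yet leaves one stale error on the thread; `parse_bucket("KEY:abc")` returns the same result with an empty queue. The trade-off is that the caller can no longer inspect the queue for diagnostics after the call.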
Thanks JT for all the work! When we meet in person again, I'll buy you a couple of beers :-) |
Looks solid :-) Thanks a lot, great job! |
Regarding the OpenSSL `ERR_clear_error()` overhead introduced in #1464:
This PR should clear the OpenSSL error queue in `XrdCryptosslX509ParseBucket()`, along with reverting the `ERR_clear_error()` calls in `XrdTlsSocket`.
So far, I'm unable to reproduce the TLS error on our local xcache with this patch (on the latest experimental, xrootd-5.2.1-0.experimental.2679300.76d03f61). This certainly could use some thorough testing to confirm. @osschar