Xrdcp high CPU consumption on AlmaLinux9 #2162
Comments
This should also be an issue in Alma 8 where the problem was reported: However, the DH parameters should not be longer than 2K, so it's not clear why high CPU usage is triggered.
There seem to be several performance issues with openssl 3.0. Some of them seem to have been resolved in version 3.1: https://www.openssl.org/blog/blog/2023/03/07/OpenSSL3.1Release/
Hi. I've also run some tests since Joao reported this issue. I was testing with a 2048-bit VOMS proxy, connecting to a CentOS 7, v5.6.4 xrootd server. My client was on an Alma 9 machine. I timed the xrdcp over 5 attempts and report the average time (quoting two figures, as a rough estimate).

It takes 1.9s with the system openssl (3.0.7-24.el9). I rebuilt xrootd against openssl 1.0.2k (approximately the version that was on CentOS 7), and with that the command was much faster, 0.15s (which is about the time I also get on an actual CentOS 7 machine). Using openssl 3.2.0 (the latest test release, I think) still gave the slower 1.9s timing, and it appears to include the fix to avoid testing extremely long DH parameters (openssl/openssl@9e0094e).

Looking for other causes, I thought it may be a change in a prime number test that openssl uses (i.e., the iterations of a Miller-Rabin test). It's a probabilistic primality test, where raising the number of iterations reduces the chance of identifying a composite number as prime. The prime test is used when checking the DH parameters the XrdCl client receives from the server. Our server now has a fixed set of DH parameters, with a 3072-bit prime. I saw that during the DH check the prime test is called two times, on this 3072-bit number and on a 3071-bit number (p/2). The number of iterations for these tests in openssl 1.0.2k was 2, and now it is 128.

Concerning the change of iterations: there seems to be discussion in openssl/openssl#9272, with the last comment on that ticket referring to this paper, https://eprint.iacr.org/2020/065, which I think is the motivation for the change. But most of the ticket is discussing key generation, not specifically DH parameter checking. My (possibly incomplete or wrong!) take is that the lower number of iterations used in openssl 1.0.2k is based on an average case, applicable for testing numbers that a client might generate (e.g. when generating a key), whereas the larger "worst case" number is applicable for testing a number that might be "adversarially-selected". So I believe the increase is deliberate and motivated by security considerations.

However, we know the client will usually be testing the same DH parameters (as the server now always offers the same set), and we know that set is safe. So one idea could be to check whether we receive exactly this set of parameters and, if so, skip the check. I returned to using the Alma 9 standard openssl, but skipped the check in the xrootd code:

(e.g. we could skip if we detect we've received our well-known parameter set). Without the EVP_PKEY_param_check call the timing is 0.24s.

As an alternative to completely disabling the check, there's EVP_PKEY_param_check_quick(), which checks some aspects of the parameters but avoids the prime test(s). I'm not sure whether it would be safe to use instead of what we have now. I believe it's probably not safe, although from the docs I looked at I wasn't sure when one might use it (e.g. https://www.openssl.org/docs/man3.0/man7/EVP_PKEY-DH.html).

Even when skipping the EVP_PKEY_param_check() call it was still slower than openssl 1.0.2k by about 1.6 times. Most of this remaining slowdown seems to come from 6 other calls to the prime testing function; these are not connected to the DH parameters but are related to the proxy I was using. (I was using a 2048-bit proxy, which means it was prime-testing 1024-bit numbers, and for these the change in iterations between openssl 1.0.2k and 3 was different: 6 to 64, I think.) I don't know if we could do anything for these or not, without some impact on security.
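To illustrate why the iteration count matters so much for run time, here is a minimal Python sketch of a Miller-Rabin test. It is not OpenSSL's implementation; the function names `miller_rabin` and `time_check` are made up for this example. Each round costs one full-size modular exponentiation, and a genuinely prime input (such as a valid DH prime) never exits early, so going from 2 to 128 rounds multiplies the work by roughly 64.

```python
import random
import time


def miller_rabin(n: int, rounds: int, rng: random.Random) -> bool:
    """Return False if n is definitely composite, True if n is probably prime."""
    if n < 2:
        return False
    # Cheap trial division; also guarantees n >= 17 below.
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)   # random base in [2, n - 2]
        x = pow(a, d, n)              # the expensive step: one per round
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False              # base a is a witness: n is composite
    return True                       # no witness found in any round


def time_check(n: int, rounds: int) -> tuple[bool, float]:
    """Run the test once and return (result, elapsed seconds)."""
    rng = random.Random(0)
    start = time.perf_counter()
    result = miller_rabin(n, rounds, rng)
    return result, time.perf_counter() - start
```

For a rough feel of the scaling, one could compare `time_check(2**3217 - 1, 2)` with `time_check(2**3217 - 1, 128)`; `2**3217 - 1` is a Mersenne prime used here only as a stand-in of similar size to the 3072-bit DH prime. In the worst case each round lets a composite through with probability at most 1/4, which is the reason for a higher count when the input may be adversarially chosen.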
I'll make a PR with the ideas from above, tomorrow or a little later this week, so we can discuss some possible concrete changes.
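The idea of skipping the expensive check when the received parameters match a known-good set could be sketched as follows. This is a Python illustration under stated assumptions, not the xrootd code and not necessarily what the PR does: `KNOWN_DH_PARAMS`, `full_param_check`, and `check_dh_params` are hypothetical names, the pinned set is a toy value, and trial division stands in for the real primality tests.

```python
from typing import Callable, Tuple

# Hypothetical pinned set. A real client would pin the server's actual
# 3072-bit (p, g); the toy safe prime 23 = 2*11 + 1 is used for illustration.
KNOWN_DH_PARAMS = {(23, 5)}


def naive_is_prime(n: int) -> bool:
    """Trial division; a stand-in for the costly probabilistic prime test."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def full_param_check(p: int, g: int) -> bool:
    """Stand-in for the full check: p and (p - 1) / 2 prime, 1 < g < p - 1."""
    return 1 < g < p - 1 and naive_is_prime(p) and naive_is_prime((p - 1) // 2)


def check_dh_params(
    p: int,
    g: int,
    full_check: Callable[[int, int], bool] = full_param_check,
) -> Tuple[bool, str]:
    """Accept pinned parameters immediately; fully check anything else."""
    if (p, g) in KNOWN_DH_PARAMS:
        return True, "pinned"          # exact match: skip the prime tests
    return full_check(p, g), "full"    # unknown parameters: pay the full cost
```

The comparison must be an exact match on the whole parameter set, so unfamiliar or tampered parameters still go through the full check and only the common case gets the fast path.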
Fixed by #2166.
An increase in CPU consumption was observed on FTS transfers using xrootd on AlmaLinux 9 machines.
After an initial investigation with perf, it seems that most of the CPU time is spent on the openssl EVP_PKEY_param_check call.
The xrootd and openssl versions installed on the machine are:
Do you have an idea of why this is happening?
Thanks a lot!