Sudden, significant slowdown (7 - 10x slower) when running tests via Docker image as of Apr 1 #2354
Thx. Can you make our lives easier and at least mention the version of testssl.sh? |
Sorry @drwetter, since it's running in Docker from your official image, I thought the SHAs would be enough for you. |
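For what it's worth, the testssl.sh version string can be pulled straight from the image itself - a minimal sketch, assuming the image's entrypoint is the script (so extra args pass through) and that the `--version` flag is available:

```
# Prints the testssl.sh banner, including the version bundled in the image
docker run --rm drwetter/testssl.sh:3.1dev --version
```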
FYI: There was a bigger change 2 weeks ago. But that doesn't correlate with your report, and in the first place it made things faster. I somehow doubt it's the cause: @polarathene did a lot of good things to the Dockerfile here, and he's using the Docker images for his own project, so I suspect when something's fishy he'd speak up ;-) Other than that: there were only two minor changes the past days, see closed PRs #2353 and #2352, which should have little to no impact. |
OK - sorry to have created noise. The only other common element in the infrastructure (which ranges across different countries, hosting accounts, customers and obviously SSL endpoints) besides the same Docker SHAs is me, so maybe it's me :) I'll close it out; no point if it's only me experiencing it. |
Potentially minor regression

March 31st was our last CI run for the image test, so I will need to wait until the next one for a comparison. That step took 5 minutes, which is fairly normal, although if I go back to a run from before the switch to the internal version of the openssl package from the Alpine images… Each of those tests is against 7 ports for non-HTTPS services, against each mix of supported cipher-suites that our config supports.

Unable to reproduce - New image performs better

Running a single site test from my own ageing system locally:

```
# SHA256 digest: 032002467a862feec3e329720af1c023ec6714d1d98a63e7b030945de656c4ab
$ docker run --rm -it drwetter/testssl.sh:3.1dev github.com
Done 2023-04-03 21:53:58 [ 82s] -->> 20.248.137.48:443 (github.com) <<--
$ docker run --rm -it drwetter/testssl.sh:3.1dev --openssl /usr/bin/openssl github.com
Done 2023-04-03 22:03:49 [ 75s] -->> 20.248.137.48:443 (github.com) <<--
# March 31st:
$ docker run --rm -it drwetter/testssl.sh@sha256:6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e --openssl /usr/bin/openssl github.com
Done 2023-04-03 22:12:30 [ 92s] -->> 20.248.137.48:443 (github.com) <<--
# April 2nd:
$ docker run --rm -it drwetter/testssl.sh@sha256:363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 --openssl /usr/bin/openssl github.com
Done 2023-04-03 22:15:02 [ 75s] -->> 20.248.137.48:443 (github.com) <<--
```

Those times are better than I was getting locally 3 months ago for that same test. Even when compared to the Alpine 3.17 + OpenSSL 3.x image digest from March 31st, the results are still better.

Establishing a reproduction

Clearly, since you could reproduce the issue across multiple systems, there's an environment issue? Our GitHub Actions CI is running on an Ubuntu 22.04 runner, but I wouldn't expect that to make much difference. My host system is running Docker Engine v23; GitHub Actions hasn't yet updated their CI runners to that AFAIK, but it is planned.

Could you try reproducing against a different site? If you don't get the slowdown there, it may have something to do with the sites you're checking against. Assuming you can get better times with…

CA certs no longer bundled into image?
I actually looked into this (unrelated to this issue).
Did your output from running that differ? I'd still be interested in knowing what the following line does:

Line 14 in aa5235e
That might resolve the problem if it was related to CA certs / trust store. If that is not it, it might be DNS related, but I have tried a before / after (with …). I also ran the March 31st digest (Alpine image) against that site and got 196 sec. No output difference there either. |
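If DNS is a suspect, one way to take resolution out of the equation would be something like this (a sketch only - the `--ip` option and the placeholder address are assumptions, not taken from the runs above):

```
# Scan the host by a fixed IP so testssl.sh skips its own DNS lookup
# (203.0.113.10 is a placeholder; substitute the site's real address).
docker run --rm -it drwetter/testssl.sh:3.1dev --ip 203.0.113.10 example.com
```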
Thanks. Let's assume it's me - I appreciate your help, I've closed the issue out anyway. I haven't got a lot of time to do a lot of testing right now, but I did run those:
I've run the … I'm running the tasks in Jenkins 2.387.1 as a simple Build Exec step, calling out to … If I get time I'll try some of your other ideas. There is no difference in the output of specific site tests between the 'fast' results from last week and the 'slow' results since the weekend, other than the time for any specific site suddenly taking 945s instead of 294s - it's literally the only difference in output. |
OK - potentially useful point: the only other difference in my tests from the GitHub ones above is that I am running in a 'batch file mode' with a list of sites, and a few extra args like … (roughly as sketched below).

The CPU has always got very high with the …
When I run against …
If I add …
If I remove …
If I run the same against github.com, it's not the same story though (with …
In short: using … |
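For context, a batch-style invocation along those lines might look roughly like this (a sketch only - the `sites.txt` name, the mount path, and the `--file`/`--mode parallel` options are assumptions, not the exact command used above):

```
# sites.txt: one hostname per line (placeholder list).
# Mount the list into the container and run testssl.sh in mass-testing mode.
docker run --rm -v "$PWD/sites.txt:/sites.txt" drwetter/testssl.sh:3.1dev \
  --file /sites.txt --mode parallel
```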
Without …:

Same site with …:

I note 2 things: the lack of ciphers listed for TLS1.2 when using …, and … |
So uhh... my PC crashed when I almost finished responding.
You'll also notice when using … Also, I remember mentioning that … It'd probably work once it's copied over to the … |
That's almost correct. I was thinking at some point to fill this option with more useful parts of the scan. --> Matter of 3.3dev. |
Cool. Well, I just wanted to point out that with Docker image SHA256 …, then both of the two subsequent Docker images … So, I mean, it might be something about my site(s) with these new images, but the only actual variation here is the Docker image of testssl.sh. Something has changed in testssl.sh or some other dependency within the Docker image to cause it. I'd love to know why, because I don't like mysteries, but for now I'm just gonna drop … Thanks both |
It doesn't reproduce as slower against the site when not using …

I don't know … Other than … |
Here's a quick and dirty script:
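The gist was a timing loop along these lines (a sketch using the two digests already mentioned above and a placeholder target, not the exact script):

```
#!/usr/bin/env bash
# Run the same scan against the old and new image digests and compare wall time.
OLD="drwetter/testssl.sh@sha256:6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e"
NEW="drwetter/testssl.sh@sha256:363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3"
SITE="example.com"   # placeholder target

for IMAGE in "$OLD" "$NEW"; do
  echo "== $IMAGE =="
  time docker run --rm "$IMAGE" --openssl /usr/bin/openssl "$SITE"
done
```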
Another site:
And a final one (my own site, which like Github.com etc, bucks the trend):
This seems to tell us - for the sites that this does affect:
To me this makes the older March SHA more consistent with expectations (you would expect something called … to behave that way).

I reproduce this on a couple of sites such as securedrop.org and www.rcplondon.ac.uk and a number of others, all of which vary completely in their infrastructure (some are on Google Cloud in k8s clusters, others are at AWS London in ECS autoscale clusters, others are standalone EC2 instances in AWS Ireland). Some of those I don't even have direct access to (but my client does, and I help with monitoring of them), so we can rule out any configuration that 'I' would typically apply on them.

However, I indeed don't reproduce it on github.com, microsoft.com, cloudflare.com, and certain other sites that I host (such as mig5.net above). On those sites the newer Docker SHAs are faster than the older one, and on the older one, … It therefore seems to be site-specific, but always a reproducible pattern on the Docker SHAs for those sites. |
Thanks for providing more information 👍
@drwetter probably knows better what … One difference with Alpine would have been that it recently got OpenSSL 3.x with the Alpine 3.17 release, but prior to that it was also on OpenSSL 1.x. It also performs DNS lookups differently IIRC (which was a compatibility problem with the old openssl binary from …). I don't know what you're running on your host, but you could probably try running …

Update

I ran the tests on my end to …
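Circling back to the OpenSSL point: which OpenSSL the image bundles can be checked directly - a sketch, assuming the binary sits at /usr/bin/openssl inside the image, as used in the runs above:

```
# Override the image's entrypoint to query the bundled Alpine openssl directly.
docker run --rm --entrypoint /usr/bin/openssl drwetter/testssl.sh:3.1dev version
```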
I'm not sure if … Fixes: …
|
You don't need … |
That last conditional appears to be due to the host … But as mentioned, this is not a bug in …

@mig5 if your tested website offers TLS 1.3, it'd avoid that branch and should work with … Alternatively, run without … I got … |
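A quick way to see whether a given site offers TLS 1.3 at all, without a full scan, would be a protocols-only run (a sketch; assumes the `--protocols` option):

```
# Report only the supported SSL/TLS protocol versions (incl. TLS 1.3) for the target.
docker run --rm -it drwetter/testssl.sh:3.1dev --protocols securedrop.org
```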
Indeed, I wondered if this might be the thing. My own website (mig5.net) supports TLS 1.3, and of course github.com etc. do too. The sites where I first noticed this are using ALBs at Amazon that were on an older set of ciphersuites etc. (TLS 1.3 only became possible on ALBs last month!). securedrop.org is not one of those but is probably a similar story (slightly surprised, as it's using Cloudflare, but it's probably a setting somewhere, yep). Thanks for all that debugging. For now I've dropped … |
Versions affected
Docker SHA256s:
- 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0
- 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3
Command line / docker command to reproduce
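Roughly, the nightly run is of this shape (a placeholder sketch only - the exact flags and site list are not shown here, and the `--file` option is an assumption; see also the batch-mode sketch earlier in the thread):

```
# Pull the latest 3.1dev image, then scan the night's list of sites from a file.
docker pull drwetter/testssl.sh:3.1dev
docker run --rm -v "$PWD/sites.txt:/sites.txt" drwetter/testssl.sh:3.1dev --file /sites.txt
```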
Experienced behavior
On Docker SHA256 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e, as at March 31st 3:16AM UTC, scanning 21 sites took 7 minutes, 35 seconds.

On Apr 1, with 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0, scanning the exact same list of sites at the same time as the previous day was taking over 27 minutes - I aborted the test.

On Apr 2nd, I reduced the number of sites to scan from 21 to 10, and used the latest Docker image 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 (my script always pulls the latest image before running). It took 16 minutes.

On another system, scanning just 4 sites, the same pattern has occurred: test time went from 1 minute 44s to 7 minutes 38s with the same changes in Docker image version. And yet another system, scanning 6 sites, went from 1m 53s to 10m. So I can rule out a problem with the first server running the Docker image - I've reproduced it on 3 separate servers (they are also testing different sites, so we can rule out the other side being the issue); the common pattern is the Docker image change.
Expected behavior
I didn't expect the two most recent Docker images to show a drastic slowdown in scanning the same sites I've been scanning every night on an automated basis for a couple of years. Wondering if something changed in the most recent releases that would explain the slowdown?
Your system (please complete the following information):
Linux 5.4.0-146-generic x86_64
(as per Docker image)