Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[馃悰 Bug]: Performance regression of Selenium 4.4 #10984

Closed
OndraM opened this issue Aug 28, 2022 · 13 comments
Closed

[馃悰 Bug]: Performance regression of Selenium 4.4 #10984

OndraM opened this issue Aug 28, 2022 · 13 comments

Comments

@OndraM
Copy link
Contributor

OndraM commented Aug 28, 2022

What happened?

Hi, I upgraded Selenium server to 4.4, and now I constantly get 5x or even over 20x performance decrease in comparison to 4.3.

This happens on our Github Actions build, and is even more significant on my local Linux machine (without Docker).

This is a typical result for set of ~15 tests ( = 15 new session requests followed by few simple commands) on GH Actions:
obrazek

Locally, the difference much more significant: from 5 seconds with 4.3.0 to 2 minutes with 4.4.0 (24x slower).

The test run is: https://github.com/OndraM/php-webdriver/actions/runs/2943219537

From what I noted from the Selenium log, with version 4.4, there is always a big delay before line Session request received by the Distributor, for example

13:35:01.350 DEBUG [DefaultChannelPool$IdleChannelDetector.run] - Closed 1 connections out of 1 in 0 ms
13:35:15.392 INFO [LocalDistributor.newSession] - Session request received by the Distributor: 

note the 15 seconds of no logged operation. While with 4.3.0 the events follow almost instantaneously, for example:

13:34:21.646 DEBUG [RequestConverter.channelRead0] - Closing input pipe.
13:34:21.661 INFO [LocalDistributor.newSession] - Session request received by the Distributor: 

For the tests, PHP language bindings for WebDriver are used - but the handshake and everything should be W3C WebDriver compatible, and was working OK with Selenium 4.3.0.

Please let me know if I can provide any more information or somehow help tracing the issue.

How can we reproduce the issue?

Run Github actions on branch https://github.com/OndraM/php-webdriver/tree/test/selenium-4-performance

Or run locally:

  1. git clone -b test/selenium-4-performance git@github.com:OndraM/php-webdriver.git
  2. cd php-webdriver/
  3. composer install
  4. (in new terminal) java -jar selenium-server-4.4.0.jar standalone
  5. (in new terminal) php -S 127.0.0.1:8000 -t tests/functional/web/
  6. vendor/bin/phpunit --filter WebDriverByTest -v --debug

Relevant log output

I enabled the finest log level, here are the test runs:

(Selenium server output is dumped at the end, in the "Prints log" part - search for cat ./logs/selenium-server.log)

Operating System

Ubuntu (Github Actions), Arch (locally)

Selenium version

4.4.0 (jar server), php-webdriver 1.12.1

What are the browser(s) and version(s) where you see this issue?

Chrome 104 (headless)

What are the browser driver(s) and version(s) where you see this issue?

Chromedriver 104.0.5112.79

Are you using Selenium Grid?

4.4.0 in standalone mode

@github-actions
Copy link

@OndraM, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

@titusfortner
Copy link
Member

Looking at the logs for this...
The "Session request received by the Distributor" is very quick in 4.3, and takes up to 15 seconds in 4.4

Not sure if it matters, but 4.3 doesn't have any of these DefaultChannelPool$IdleChannelDetector entries.

13:35:00.817 DEBUG [RequestConverter.channelRead0] - Incoming message: DefaultLastHttpContent(data: PooledSlicedByteBuf(ridx: 0, widx: 328, cap: 328/328, unwrapped: PooledUnsafeDirectByteBuf(ridx: 476, widx: 476, cap: 2048)), decoderResult: success)
13:35:00.817 DEBUG [RequestConverter.channelRead0] - Closing input pipe.
13:35:00.818 DEBUG [SpanWrappedHttpHandler.execute] - Wrapping request. Before OpenTelemetryContext{tracer=io.opentelemetry.sdk.trace.SdkTracer@5301193a, context={}, span id=0000000000000000, trace id=00000000000000000000000000000000} and after OpenTelemetryContext{tracer=io.opentelemetry.sdk.trace.SdkTracer@5301193a, context={opentelemetry-trace-span-key=SdkSpan{traceId=9e11407a8b9c375a9fbf9d0783df7920, spanId=16a39e934be186c6, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=session_queue, kind=INTERNAL, attributes=AttributesMap{data={random.key=1d6b8d48-a1c6-4b0c-9656-2bb08eb9a0ae}, capacity=128, totalAddedValues=1}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1661693700818000000, endEpochNanos=0}}, span id=16a39e934be186c6, trace id=9e11407a8b9c375a9fbf9d0783df7920}
13:35:00.818 DEBUG [HttpTracing.inject] - Injecting (POST) /session into OpenTelemetrySpan{traceId=9e11407a8b9c375a9fbf9d0783df7920,spanId=16a39e934be186c6} at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler:84
13:35:01.350 DEBUG [DefaultChannelPool$IdleChannelDetector.run] - Entry count for : http://localhost:43625 : 1
13:35:01.350 DEBUG [DefaultChannelPool$IdleChannelDetector.expiredChannels] - Adding Candidate expired Channel [id: 0xff143818, L:/127.0.0.1:36634 ! R:localhost/127.0.0.1:43625] isIdleTimeoutExpired=false isRemotelyClosed=true isTtlExpired=false
13:35:01.350 DEBUG [DefaultChannelPool$IdleChannelDetector.closeChannels] - Closing Idle Channel [id: 0xff143818, L:/127.0.0.1:36634 ! R:localhost/127.0.0.1:43625]
13:35:01.350 DEBUG [DefaultChannelPool$IdleChannelDetector.run] - Closed 1 connections out of 1 in 0 ms
13:35:15.392 INFO [LocalDistributor.newSession] - Session request received by the Distributor: 

I threw the logs into gists for easier parsing:
https://gist.github.com/titusfortner/4fc1e999e66012a8c40caaa85682b785
https://gist.github.com/titusfortner/f6e6dda3e542f38c17a81a285cb1a1c9

@mikk150
Copy link

mikk150 commented Sep 20, 2022

also happens to me on 4.3.0

@AB-xdev
Copy link

AB-xdev commented Oct 5, 2022

So while comparing the changes between 4.3.0 and 4.4.0 I noticed that the DEFAULT_RETRY_INTERVAL was changed from 0 to 15
grafik
(was done in this commit)

This exactly matches the delay observed in the issue which is always around 15s

So just for fun I set the --session-retry-interval to 0 (as it was before) however the problem was still present.
When however setting it to 1 I was able to get a Server/Grid response within 2-3s instead of 15s.

So for now using --session-retry-interval 1 looks like a workaround.

@brondsem
Copy link

So will DEFAULT_RETRY_INTERVAL be changed back to 0 or near 0? Otherwise it seems a lot of people will have slow executions and not know why.

In the mean time, can --session-retry-interval 1 workaround be used in the selenium docker images somehow?

@cbonte
Copy link

cbonte commented Oct 14, 2022

@brondsem you can provide it with the SE_OPTS environment variable

@diemol
Copy link
Member

diemol commented Oct 17, 2022

Thanks for reporting this. In Docker you can set the value through SE_SESSION_RETRY_INTERVAL. I will make changes so this goes back to the previous behavior.

@therealdjryan
Copy link

therealdjryan commented Oct 17, 2022

I don't think modifying session-retry-interval addresses the speed discrepancy between 3.141 and 4.x

@diemol
Copy link
Member

diemol commented Oct 17, 2022

I don't think modifying session-retry-interval addresses the speed discrepancy between 3.141 and 4.x

Why not?

@titusfortner
Copy link
Member

This particular issue is about the specific change made in 4.4 that increased time for Session request received by the Distributor. If there is an additional speed-related issue to address, please open a new ticket with a reproducible case.

@victortanase
Copy link

@diemol when will be the fix available please?

@titusfortner
Copy link
Member

We're going to try to get a patch release out tomorrow.

@github-actions
Copy link

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Nov 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants