[🐛 Bug]: Sessions are stuck for hours #11593

Closed
arnonax-tr opened this issue Jan 26, 2023 · 6 comments

Comments

@arnonax-tr
Contributor

arnonax-tr commented Jan 26, 2023

What happened?

Occasionally, sessions are not closed and hang for hours. Recently we saw sessions that had been open for a few hours, although most of our tests run for a few minutes and none run much longer. Because many Jenkins jobs run tests from many test projects on the same grid, it's hard to determine exactly which tests owned the stuck sessions, but there were about 10 of them (plus about 10 other active, short sessions, out of 100 used slots).

Anyway, we looked at the logs of one of the nodes that had a problem, and saw the errors shown below.

Note: we don't specify --session-timeout, so AFAIU, if the sessions had simply hung because the tests didn't close them properly, the grid should have killed them after 5 minutes.
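To spot sessions that outlive the expected timeout, something like the following could be run against the Grid's status output. This is only a sketch: it assumes the parsed `/status` JSON nests sessions under `value.nodes[].slots[].session`, each with a `sessionId` and an ISO-8601 UTC `start` field, which should be checked against the Grid version in use. The function name and the threshold are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def find_long_running_sessions(status, max_age, now=None):
    """Return (session_id, age) pairs for sessions older than max_age.

    ASSUMPTION: `status` is the parsed JSON from the Grid's /status endpoint,
    shaped like {"value": {"nodes": [{"slots": [{"session": {...}}]}]}}.
    """
    now = now or datetime.now(timezone.utc)
    stuck = []
    for node in status.get("value", {}).get("nodes", []):
        for slot in node.get("slots", []):
            session = slot.get("session")
            if not session:
                continue  # free slot, nothing to check
            # ASSUMPTION: "start" is a UTC timestamp such as
            # "2023-01-26T09:50:13.908Z"; only second precision is used.
            started = datetime.strptime(
                session["start"][:19], "%Y-%m-%dT%H:%M:%S"
            ).replace(tzinfo=timezone.utc)
            age = now - started
            if age > max_age:
                stuck.append((session["sessionId"], age))
    return stuck
```

Calling it with `max_age=timedelta(minutes=5)` would list every session that has been open longer than the 5 minutes mentioned above, which makes it easier to match stuck sessions to node logs.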

How can we reproduce the issue?

N/A

Relevant log output

Starting ChromeDriver 106.0.5249.61 (511755355844955cd3e264779baf0dd38212a4d0-refs/branch-heads/5249@{#569}) on port 23415
 Only local connections are allowed.
 Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
 ChromeDriver was started successfully.
 09:50:13.908 INFO [LocalNode.newSession] - Session created by the Node. Id: d3b0f64c4696f5911482580aaa89630e, Caps: Capabilities {acceptInsecureCerts: false, browserName: chrome, browserVersion: 106.0.5249.119, chrome: {chromedriverVersion: 106.0.5249.61 (511755355844 
 09:50:15.076 INFO [ProxyNodeWebsockets.createWsEndPoint] - Establishing connection to ws://localhost:46061/devtools/browser/ea6377a5-82d9-4881-9167-eb73155f2147
 09:50:37.591 WARN [DefaultChannelPipeline.onUnhandledInboundException] - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
 org.openqa.selenium.WebDriverException: java.lang.IllegalArgumentException: statusCode
 Build info: version: '4.5.3', revision: '4b786a1e430'
 System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.4.0-1091-azure', java.version: '11.0.16'
 Driver info: driver.version: unknown
     at org.openqa.selenium.remote.http.jdk.JdkHttpClient$5.send(JdkHttpClient.java:232)
     at org.openqa.selenium.netty.server.WebSocketMessageHandler.lambda$channelRead0$0(WebSocketMessageHandler.java:47)
     at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
     at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
     at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:566)
     at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
     at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
     at java.base/java.lang.Thread.run(Thread.java:829)
 Caused by: java.lang.IllegalArgumentException: statusCode
     at java.net.http/jdk.internal.net.http.websocket.WebSocketImpl.sendClose(WebSocketImpl.java:301)
     at org.openqa.selenium.remote.http.jdk.JdkHttpClient$5.lambda$send$2(JdkHttpClient.java:215)
     at org.openqa.selenium.remote.http.jdk.JdkHttpClient$5.send(JdkHttpClient.java:222)
     ... 9 more
 09:50:37.988 INFO [SessionSlot.stop] - Stopping session d3b0f64c4696f5911482580aaa89630e
 Stream closed EOF for automation/selenium-selenium-chrome-6447685758-2jsbw (selenium)

Operating System

Linux, docker

Selenium version

.NET 6.0, Selenium.WebDriver 4.6.0

What are the browser(s) and version(s) where you see this issue?

Chrome v.106.0

What are the browser driver(s) and version(s) where you see this issue?

ChromeDriver 106

Are you using Selenium Grid?

Selenium Grid 4.5.0 (revision fe167b1)

@github-actions

@arnonax-tr, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then the I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

@diemol
Member

diemol commented Jan 26, 2023

There was a fix for that in 4.7.0 I believe. Please update to the most recent version and report back.

@arnonax-tr
Contributor Author

Thanks @diemol.
We upgraded our Selenium Grid to 4.8.0, but apparently it didn't solve the problem.

Attached is the node log for a stuck session. The session ID is cf2785d638c98a2b352fb5abdcda5230.
stuckjob.node.log

I also got the Hub logs, but apparently they start only after this session failed, so they don't help much. When it happens again, I'll try to get them sooner.

One more important thing I noticed: most of the sessions that got stuck started at exactly the same time. From our side this makes sense, because a test cycle runs many tests in parallel, so they all start at the same time, but it still doesn't explain why they get stuck like that.

@diemol
Member

diemol commented Feb 1, 2023

@arnonax-tr you did not upgrade to 4.8.0. The log shows version 4.5.3:
(apologies for pasting an image; copying the text did not work for some reason).

image

Please upgrade and let us know.
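
A quick way to confirm which Grid version actually produced a log is to search it for the `Build info` lines that Selenium prints with exceptions, like the one in the log output at the top of this issue. A minimal sketch (the helper name is hypothetical, and the regex assumes the `Build info: version: '...', revision: '...'` format seen in this issue):

```python
import re

# Matches lines such as:
#   Build info: version: '4.5.3', revision: '4b786a1e430'
BUILD_INFO = re.compile(r"Build info: version: '([^']+)', revision: '([^']+)'")


def versions_in_log(log_text):
    """Return the distinct (version, revision) pairs, in first-seen order."""
    seen = []
    for match in BUILD_INFO.finditer(log_text):
        pair = (match.group(1), match.group(2))
        if pair not in seen:
            seen.append(pair)
    return seen
```

If the result contains anything other than the version you expect to be deployed, the node that wrote the log is probably still running an older image.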

@diemol
Member

diemol commented Mar 15, 2023

Closing as we did not get more information.

@diemol diemol closed this as completed Mar 15, 2023

github-actions bot commented Dec 9, 2023

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 9, 2023