[🐛 Bug]: 4.29.0-2025030 - Java Error #2722

Closed
extravio opened this issue Mar 21, 2025 · 10 comments

Comments

@extravio

What happened?

Hi there,

We're updating from 4.24.0-20240907 (chart: 0.35.2) to 4.29.0-20250303 (chart: 0.40.1). Tests that previously passed are now failing.
See the Edge Node logs below.

Tested with the default values.yaml.

Any ideas?

Command used to start Selenium Grid with Docker (or Kubernetes)

Deploying the helm chart, with default values.

Relevant log output

01:17:32.538 INFO [Node.<init>] - Binding additional locator mechanisms: relative
01:17:32.898 INFO [NodeServer$2.start] - Starting registration process for Node http://172.25.52.21:5555
01:17:32.899 INFO [NodeServer.execute] - Started Selenium node 4.29.0 (revision 18ae989): http://172.25.52.21:5555
01:17:32.906 INFO [NodeServer$2.lambda$start$2] - Sending registration event...
01:17:33.040 INFO [NodeServer.lambda$createHandlers$2] - Node has been added
2025-03-21 01:17:33,041 INFO spawned: 'novnc' with pid 118
2025-03-21 01:17:33,045 WARN exited: novnc (exit status 0; not expected)
2025-03-21 01:17:36,049 INFO spawned: 'novnc' with pid 119
2025-03-21 01:17:36,053 WARN exited: novnc (exit status 0; not expected)
2025-03-21 01:17:36,053 INFO gave up: novnc entered FATAL state, too many start retries too quickly
01:18:35.994 INFO [LocalNode.newSession] - Session created by the Node. Id: 54e1beae2c32571df3043254f5fe38df, Caps: Capabilities {acceptInsecureCerts: false, browserName: MicrosoftEdge, browserVersion: 133.0.3065.92, fedcm:accounts: true, ms:edgeOptions: {debuggerAddress: localhost:40051}, msedge: {msedgedriverVersion: 133.0.3065.92 (e8591772eda3..., userDataDir: /tmp/.com.microsoft.Edge.qJ...}, networkConnectionEnabled: false, pageLoadStrategy: normal, platformName: linux, proxy: Proxy(), se:bidiEnabled: false, se:cdp: wss://dotc-selenium-grid-de..., se:cdpVersion: 133.0.3065.92, se:containerName: selenium-grid-selenium-node..., setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify, webauthn:extension:credBlob: true, webauthn:extension:largeBlob: true, webauthn:extension:minPinLength: true, webauthn:extension:prf: true, webauthn:virtualAuthenticators: true}
01:21:36.027 WARN [SpanWrappedHttpHandler.execute] - Unable to execute request: Build info: version: '4.29.0', revision: '18ae989'
System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.15.0-1081-azure', java.version: '21.0.6'
Driver info: driver.version: unknown
org.openqa.selenium.WebDriverException: Build info: version: '4.29.0', revision: '18ae989'
System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.15.0-1081-azure', java.version: '21.0.6'
Driver info: driver.version: unknown
	at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute(JdkHttpClient.java:419)
	at org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:54
[...]
Caused by: java.lang.InterruptedException
	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:386)
[...]
01:29:02.619 INFO [LocalNode.stopTimedOutSession] - Session id 54e1beae2c32571df3043254f5fe38df timed out, stopping...
01:29:03.684 INFO [SessionSlot.stop] - Stopping session 54e1beae2c32571df3043254f5fe38df

Operating System

AKS

Docker Selenium version (image tag)

4.29.0-2025030

Selenium Grid chart version (chart version)

0.40.1


@extravio, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then the I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

@VietND96
Member

Hi, what session timeout is set on the Node? Does any step stay idle for longer than that timeout?

@extravio
Author

Hi @VietND96,
On 4.24.0-20240907, that particular test takes about 10s to complete (it's just to confirm Selenium Grid is working as expected).
On 4.29.0-20250303, the test doesn't get processed, so it hits the timeout.
I cannot see any other error in the logs of the other components.
Any idea where the problem could be coming from?

@VietND96
Member

Can you share what your client code looks like?

@Mannyid

Mannyid commented Mar 21, 2025

I'm seeing the same issue with my deployment on 4.29.0-20250222. Has there been any fix for this?

@VietND96
Member

I noticed two things:

  • Chart deployment: you mentioned that you are deploying the Helm chart with default values.
  • In the Hub logs, I can see request capabilities with platformName: linux, which means your client code is calling setPlatformName (or the equivalent in other bindings).

To fix this, in the chart values, under each node, set the key hpa.platformName to linux as well, so that it matches the request capabilities (it is empty by default).
This is an improvement in the Grid scaler; I documented its usage in detail here: https://github.com/SeleniumHQ/docker-selenium/tree/trunk/charts/selenium-grid#scaler-trigger-configuration
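
For readers following along, the suggestion above might look roughly like this as a values override. This is only a sketch: the comment names hpa.platformName "under each node", while the top-level node keys used here (edgeNode, chromeNode, firefoxNode) are an assumption that should be checked against the values.yaml of the chart version in use.

```yaml
# Sketch of a values override; the top-level node key names are assumed,
# only hpa.platformName comes from the comment above.
edgeNode:
  hpa:
    platformName: linux   # match "platformName: linux" in the request capabilities
chromeNode:
  hpa:
    platformName: linux
firefoxNode:
  hpa:
    platformName: linux
```

Setting the value only on the node types actually in use should be enough; the point is that the scaler's platformName agrees with what the client requests.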

@extravio
Author

extravio commented Mar 23, 2025

Hi @VietND96,

OK, I found the problem. The app label on the nodes was changed in this PR:
https://github.com/SeleniumHQ/docker-selenium/pull/2475/files#diff-81a82f068405c6e0c78f9c712e58508ee92d328bcd4329cfdb7f7c3b25c11282

Example: selenium-grid-selenium-chrome-node -> selenium-grid-selenium-node-chrome

We have network policies based on the app labels, so we need to update them.

Thanks!
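
As an illustration of the kind of update described above, a NetworkPolicy that selects node pods by the app label would need its selector changed to the new value. The policy below is hypothetical (its name, the allowed sources, and the overall shape are not from this thread); only the old and new label values and the node port 5555 come from the report and logs.

```yaml
# Hypothetical policy for illustration; only the label values and
# port 5555 are taken from this issue.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-traffic-to-chrome-node   # hypothetical name
spec:
  podSelector:
    matchLabels:
      # old value: selenium-grid-selenium-chrome-node
      app: selenium-grid-selenium-node-chrome
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}   # any pod in the same namespace; tighten as needed
      ports:
        - protocol: TCP
          port: 5555
```

A policy that still selects the old label matches no pods after the upgrade, so its allow rules stop applying to the nodes. If other policies isolate those pods, traffic to them can be dropped without any error in the Grid logs.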

@Mannyid

Mannyid commented Mar 24, 2025

The naming convention is pretty long, so if I shorten the names to just Hub, chrome, or edge, will I run into this issue?

@VietND96
Member

I guess it depends on the network policies implemented in your cluster.
As for naming conventions, you can freely change them in the chart via the config key nameOverride.

@Mannyid

Mannyid commented Mar 26, 2025

OK, so I fixed this error by increasing the proxytimeout on the ingress.
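
The comment above does not say which ingress controller or setting was changed. As a sketch, assuming the ingress-nginx controller, longer proxy timeouts can be set with annotations on the Ingress resource; the value 3600 (seconds) is purely illustrative.

```yaml
# Fragment of an Ingress manifest; assumes the ingress-nginx controller.
# The timeout value is illustrative, not taken from this thread.
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
```

If the ingress is created by the Selenium Grid chart itself, the chart may expose its own value for this; check its values.yaml rather than relying on this fragment.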
