[🐛 Bug]: Grid is going down everyday and I need to manually restart the hub #14467
Comments
@abhi-iac, thank you for creating this issue. We will troubleshoot it as soon as we can. |
There is no information we can use to triage this. In addition, you are using Selenium 4.4. Please update to the latest (4.24) on both tests and Grid and open a new issue with the complete template filled out if you still face the problem. |
Even if this is a version discrepancy, it should throw a corresponding error. Instead, it throws an error like "The HTTP request to the remote WebDriver server for URL http://selenium-grid.xxxxxxxx.com:4444/wd/hub/session timed out after 120 seconds. ---> System.Net.WebException: The request was aborted: The operation has timed out". We have the timeout set to the default of 300. Do you have a better explanation? |
4.4 was released two years ago, and the Grid has had many bug fixes and changes since then. I cannot give you an explanation of why this happens; my assumption is that session creation took too long and the client timed out. |
We are seeing this error because the Selenium Grid itself goes down at random. What we really can't determine is why the grid is going down. Anyway, thank you for your reply. We will debug it more and see. |
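One way to pin down when the grid actually goes down is to poll its status endpoint on a schedule and record each result. The following is a minimal sketch, assuming the Grid 4 `/status` endpoint returns JSON with a `value.ready` flag; the function names and the 10-second timeout are illustrative and do not come from this thread.

```python
import json
import urllib.request


def grid_is_ready(status_payload: dict) -> bool:
    """Return True when a Grid /status payload reports value.ready == true."""
    value = status_payload.get("value") or {}
    return bool(value.get("ready", False))


def fetch_status(base_url: str, timeout: float = 10.0) -> dict:
    """GET <base_url>/status and decode the JSON body (assumed Grid 4 endpoint)."""
    with urllib.request.urlopen(f"{base_url}/status", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def check_once(base_url: str, fetch=fetch_status) -> str:
    """Classify the grid as 'ready', 'not-ready', or 'unreachable: <reason>'."""
    try:
        return "ready" if grid_is_ready(fetch(base_url)) else "not-ready"
    except (OSError, ValueError) as exc:
        # OSError covers connection failures and timeouts; ValueError covers bad JSON.
        return f"unreachable: {exc}"
```

Calling `check_once("http://<grid-host>:4444")` from a scheduled task every minute and appending the result with a timestamp to a file would show whether the hub stops answering before the first client timeout appears.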
I dug in more and added logging to the distributor, router, sessionqueue, session, and event-bus services on my server. As I mentioned, I see these logs every day at a specific time on the router and distributor services. With your expertise, have you seen something similar in the past? PS: I know I haven't updated the grid version; some automation pipelines depend on this grid.
04:08:12.823 WARN [SeleniumSpanExporter$1.lambda$export$1] - Unable to execute request for an existing session: Unable to find session with ID: 62a591c058a29e59e9dcc254adb27b28
04:08:12.823 WARN [SeleniumSpanExporter$1.lambda$export$3] - {"traceId": "7506a231bc88ffffed52b53f4822f12b","eventTime": 1726474092826299300,"eventName": "exception","attributes": {"exception.message": "Unable to execute request for an existing session: Unable to find session with ID: 62a591c058a29e59e9dcc254adb27b28\nBuild info: version: '4.4.0', revision: 'e5c75ed026a'\nSystem info: host: 'xxxxxxx', ip: 'xxxxxxx', os.name: 'Windows Server 2019', os.arch: 'amd64', os.version: '10.0', java.version: '11.0.16.1'\nDriver info: driver.version: unknown\nBuild info: version: '4.4.0', revision: 'e5c75ed026a'\nSystem info: host: 'xxxxxxx', ip: 'xxxxxxx', os.name: 'Windows Server 2019', os.arch: 'amd64', os.version: '10.0', java.version: '11.0.16.1'\nDriver info: driver.version: unknown","exception.stacktrace": "org.openqa.selenium.NoSuchSessionException: Unable to find session with ID: 62a591c058a29e59e9dcc254adb27b28\nBuild info: version: '4.4.0', revision: 'e5c75ed026a'\nSystem info: host: 'xxxxxxx', ip: 'xxxxxxx', os.name: 'Windows Server 2019', os.arch: 'amd64', os.version: '10.0', java.version: '11.0.16.1'\nDriver info: driver.version: unknown\nBuild info: version: '4.4.0', revision: 'e5c75ed026a'\nSystem info: host: 'xxxxxxx', ip: 'xxxxxxx', os.name: 'Windows Server 2019', os.arch: 'amd64', os.version: '10.0', java.version: '11.0.16.1'\nDriver info: driver.version: unknown\r\n\tat 
java.base\u002fjdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)\r\n\tat java.base\u002fjdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)\r\n\tat java.base\u002fjdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)\r\n\tat java.base\u002fjava.lang.reflect.Constructor.newInstance(Constructor.java:490)\r\n\tat org.openqa.selenium.remote.ErrorCodec.decode(ErrorCodec.java:134)\r\n\tat org.openqa.selenium.grid.web.Values.get(Values.java:48)\r\n\tat org.openqa.selenium.grid.sessionmap.remote.RemoteSessionMap.makeRequest(RemoteSessionMap.java:119)\r\n\tat org.openqa.selenium.grid.sessionmap.remote.RemoteSessionMap.get(RemoteSessionMap.java:90)\r\n\tat org.openqa.selenium.grid.router.HandleSession.lambda$loadSessionId$4(HandleSession.java:159)\r\n\tat io.opentelemetry.context.Context.lambda$wrap$2(Context.java:224)\r\n\tat org.openqa.selenium.grid.router.HandleSession.execute(HandleSession.java:122)\r\n\tat org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)\r\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\r\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\r\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\r\n\tat org.openqa.selenium.grid.router.Router.execute(Router.java:91)\r\n\tat org.openqa.selenium.grid.web.EnsureSpecCompliantResponseHeaders.lambda$apply$0(EnsureSpecCompliantResponseHeaders.java:34)\r\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\r\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\r\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\r\n\tat org.openqa.selenium.remote.http.Route$NestedRoute.handle(Route.java:270)\r\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\r\n\tat 
org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\r\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\r\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\r\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\r\n\tat org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)\r\n\tat org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)\r\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\r\n\tat org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)\r\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\r\n\tat org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)\r\n\tat java.base\u002fjava.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)\r\n\tat java.base\u002fjava.util.concurrent.FutureTask.run(FutureTask.java:264)\r\n\tat java.base\u002fjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\r\n\tat java.base\u002fjava.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\r\n\tat java.base\u002fjava.lang.Thread.run(Thread.java:829)\r\n","exception.type": "org.openqa.selenium.NoSuchSessionException","http.flavor": 1,"http.handler_class": "org.openqa.selenium.grid.router.HandleSession","http.host": "selenium-grid.unum.com:4444","http.method": "POST","http.request_content_length": "58","http.scheme": "HTTP","http.target": "\u002fsession\u002f62a591c058a29e59e9dcc254adb27b28\u002felement","http.user_agent": "selenium\u002f4.13.0 (.net windows)","session.id": "62a591c058a29e59e9dcc254adb27b28"}}
Distributor
If you look at the time frame after "04:16:03.649", the distributor was up because I had killed the Distributor and Router tasks. I have sufficient memory allocated for all the services. |
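To check whether the "Unable to find session" warnings really cluster at one time of day, the router log can be scanned for them and counted per minute. This is a rough sketch that assumes every log line follows the `HH:MM:SS.mmm LEVEL [source] - message` shape seen in the excerpt above; the function names are illustrative.

```python
import re
from collections import Counter

# Shape of the log lines quoted in this thread: "04:08:12.823 WARN [source] - message"
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+(?P<level>[A-Z]+)\s+"
    r"\[(?P<source>[^\]]+)\]\s+-\s+(?P<message>.*)$"
)
SESSION_RE = re.compile(r"Unable to find session with ID: (?P<sid>[0-9a-f]{32})")


def missing_sessions(lines):
    """Yield (time, session_id) once per distinct pair reported as missing."""
    seen = set()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # stack-trace continuations and other non-matching lines
        found = SESSION_RE.search(match.group("message"))
        if not found:
            continue
        key = (match.group("time"), found.group("sid"))
        # The JSON trace event repeats the plain warning, so deduplicate.
        if key not in seen:
            seen.add(key)
            yield key


def count_by_minute(lines):
    """Count distinct missing-session reports per HH:MM bucket."""
    return Counter(time[:5] for time, _ in missing_sessions(lines))
```

Running `count_by_minute(open("router.log", encoding="utf-8"))` (the file name is a placeholder) over several days of logs would show whether the failures start in the same minute each day, which could then be compared against scheduled tasks on the server.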
Hello, has anyone had a similar issue? It has been a while and I am still unable to find a resolution. |
After upgrading the grid to 4.24.0 and adding a few arguments to the distributor and router, the grid looks stable now. |
This issue has been automatically locked since there has not been any recent activity since it was closed. Please open a new issue for related bugs. |
What happened?
Every day at a specific time my grid goes down, causing all the pipelines to fail. The grid is fine on weekends, as no pipelines run against it.
I am really unsure why the grid is going down even though there are sufficient sessions available. While the pipeline is running, the error shows "Cannot find session with id: 92598c7905b9a20bfcbe25414a79b6de Build info: version: '4.4.0', revision: 'e5c75ed026a' System info: host: 'xxxx', ip: 'x.x.x.x', os.name: 'Windows Server 2019', os.arch: 'amd64', os.version: '10.0', java.version: '11.0.16.1' Driver info: driver.version: unknown"
But once the pipeline completes, the error is "OpenQA.Selenium.WebDriverException : The HTTP request to the remote WebDriver server for URL http://selenium-grid.unum.com:4444/wd/hub/session/2044c12416348b5c4f61ca618922146d/element timed out after 120 seconds"
I found these other bugs, but those are on Docker:
SeleniumHQ/docker-selenium#2135
#14322
How can we reproduce the issue?
This happens at a specific time every day.
Relevant log output
Operating System
Windows server 2019 Datacenter
Selenium version
openjdk 11.0.16.1 2022-08-12 LTS
What are the browser(s) and version(s) where you see this issue?
Chrome 128.0.6613.85
What are the browser driver(s) and version(s) where you see this issue?
ChromeDriver.128.0.6613.84
Are you using Selenium Grid?
Selenium 4.152369