You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Sep 21, 2021. It is now read-only.
After analysing the debug logs I suspect the "copying files" to be the cause/trigger.
Befor the cleanup process stops working you have following log entries several times:
Then the "copying files" throws an warning (the video from that node has been copied allready):
21:19:44.493 [http://172.30.0.11:40761] WARN d.z.e.z.p.DockerSeleniumRemoteProxy - Error while copying the video
java.nio.file.FileAlreadyExistsException: /home/seluser/videos/zalenium_SELPWKESL01_GRID_firefox_LINUX_20200101211850539_COMPLETED.mp4
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
at java.nio.file.Files.newOutputStream(Files.java:216)
at java.nio.file.Files.copy(Files.java:3016)
at de.zalando.ep.zalenium.proxy.DockerSeleniumRemoteProxy.copyVideos(DockerSeleniumRemoteProxy.java:644)
at de.zalando.ep.zalenium.proxy.DockerSeleniumRemoteProxy.processContainerAction(DockerSeleniumRemoteProxy.java:614)
at de.zalando.ep.zalenium.proxy.DockerSeleniumRemoteProxy.processContainerAction(DockerSeleniumRemoteProxy.java:601)
at de.zalando.ep.zalenium.proxy.DockerSeleniumRemoteProxy.videoRecording(DockerSeleniumRemoteProxy.java:565)
at de.zalando.ep.zalenium.proxy.DockerSeleniumRemoteProxy.cleanupNode(DockerSeleniumRemoteProxy.java:759)
at de.zalando.ep.zalenium.proxy.DockerSeleniumRemoteProxy.afterSession(DockerSeleniumRemoteProxy.java:400)
at org.openqa.grid.internal.TestSlot.performAfterSessionEvent(TestSlot.java:253)
at de.zalando.ep.zalenium.registry.ZaleniumRegistry._release(ZaleniumRegistry.java:193)
at de.zalando.ep.zalenium.registry.ZaleniumRegistry.lambda$0(ZaleniumRegistry.java:177)
at java.lang.Thread.run(Thread.java:748)
After that warning no "checking containers" log entries pop up anymore.
And here the logs for a node:
16:29:26.788 [http://172.27.0.11:40001] DEBUG d.z.e.z.c.DockerContainerClient - http://172.27.0.11:40001 video-rec: stopped
--40001LOG 16:29:23:986912588 Trapped SIGTERM or SIGINT so shutting down ffmpeg gracefully...
--40001LOG 16:29:23:991593194 Will kill VID_TOOL_PID=647 ...
--40001LOG 16:29:24:998601310 Tried to kill -SIGTERM VID_TOOL_PID=647
--40001LOG 16:29:25:003496489 Waiting up to 6s for VID_TOOL_PID=647 to end with SIGTERM...
--40001LOG 16:29:25:012840069 wait_pid successfully managed to SIGTERM:VID_TOOL_PID=647 within less than 6s
--40001LOG 16:29:25:017483677 Will try to fix the videos...
--40001LOG 16:29:25:530947578 Fixing perms for /home/seluser/videos/vid_chrome_or_firefox_40001.mkv*
--40001LOG 16:29:25:542825162 Changing video encoding from mkv to mp4...
--40001LOG 16:29:26:467943099 Conversion from mkv to mp4 succeeded!
--40001LOG 16:29:26:472634368 Cleaning up /home/seluser/videos/vid_chrome_or_firefox_40001.mkv ...
--40001LOG 16:29:26:479091184 Optimizing /home/seluser/videos/vid_chrome_or_firefox_40001.mp4 for HTTP streaming...
--40001LOG 16:29:26:500525487 Succeeded to mp4box_retry.sh within 8s
--40001LOG 16:29:26:505331225 ffmpeg shutdown complete.
Input #0, matroska,webm, from '/home/seluser/videos/vid_chrome_or_firefox_40001.mkv':
Output #0, mp4, to '/home/seluser/videos/vid_chrome_or_firefox_40001.mp4':
16:29:26.940 [http://172.27.0.11:40001] DEBUG d.z.e.z.p.DockerSeleniumRemoteProxy - Video file copied to: /home/seluser/videos/zalenium_SELPWKESL01_GRID_firefox_LINUX_20191120162848196_COMPLETED.mp4
16:29:26.944 [http://172.27.0.11:40001] DEBUG d.z.e.z.c.DockerContainerClient - http://172.27.0.11:40001 [bash, -c, transfer-logs.sh]
16:29:27.052 [http://172.27.0.11:40001] DEBUG d.z.e.z.c.DockerContainerClient - http://172.27.0.11:40001 bash: transfer-logs.sh: command not found
16:29:27.110 [http://172.27.0.11:40001] DEBUG d.z.e.z.p.DockerSeleniumRemoteProxy - Logs copied to: /home/seluser/videos/logs/zalenium_SELPWKESL01_GRID_firefox_LINUX_20191120162848196_COMPLETED
16:29:27.116 [http://172.27.0.11:40001] DEBUG d.z.e.z.c.DockerContainerClient - http://172.27.0.11:40001 [bash, -c, cleanup-container.sh]
16:29:27.277 [http://172.27.0.11:40001] DEBUG d.z.e.z.c.DockerContainerClient - http://172.27.0.11:40001
this part is not happening after the above descriped error which normally does for "working" nodes:
16:29:27.943 [http://172.27.0.11:40001] DEBUG d.z.e.z.p.DockerSeleniumRemoteProxy - Proxy has reached max test sessions.
16:29:27.943 [http://172.27.0.11:40001] DEBUG d.z.e.z.p.DockerSeleniumRemoteProxy - Shutting down node due to proxy has reached max test sessions.
16:29:30.348 [http://172.27.0.11:40001] DEBUG d.z.e.z.p.DockerSeleniumRemoteProxy - Marking the node as down because it was stopped after 1 tests.
16:29:30.348 [http://172.27.0.11:40001] DEBUG d.z.e.z.registry.ZaleniumRegistry - Cleaning up stale test sessions on the unregistered node http://172.27.0.11:40001
16:29:30.348 [http://172.27.0.11:40001] DEBUG d.z.e.z.proxy.AutoStartProxySet - Stopping removed container [http://172.27.0.11:40001
16:29:30.352 [http://172.27.0.11:40001] DEBUG d.z.e.z.p.DockerSeleniumRemoteProxy - stopPolling() deactivated
16:29:30.994 [AutoStartProxyPoolPoller] DEBUG d.z.e.z.c.DockerContainerClient - Container ContainerCreationStatus [isCreated=true, containerName=zalenium_bpjaWa, containerId=e22214646faed56e2016a86b53b8955bf1e3edb21bb4036630091119cc5c49b3, nodePort=40001] not found - it is terminated.
Workaround
Restarting the hub
Possible Solution
surround the copying file command with an if "file exists" and not only if the folder exists (DockerSeleniumRemoteProxy:658)
The text was updated successfully, but these errors were encountered:
🐛 Bug Report
After some time(it varies from 1 - 30 days) the zalenium hub stops to cleanup its nodes.
To Reproduce
Cant be reproduced/triggered manually.
Expected behavior
After a test run, the node should be cleaned up and removed by the hub
Environment
OS: centos 7.6.1810
Zalenium Image Version(s): 3.141.59v
Docker version: Docker version 18.09.1, build 4c52b90
Docker-Compose version (if applicable): docker-compose version 1.23.2, build 1110ad01
Usage: two sequences every 5 minutes
Exact Docker command to start the containers:
Detailed Description:
After analysing the debug logs I suspect the "copying files" to be the cause/trigger.
Befor the cleanup process stops working you have following log entries several times:
Then the "copying files" throws an warning (the video from that node has been copied allready):
After that warning no "checking containers" log entries pop up anymore.
And here the logs for a node:
this part is not happening after the above descriped error which normally does for "working" nodes:
Workaround
Restarting the hub
Possible Solution
surround the copying file command with an if "file exists" and not only if the folder exists (DockerSeleniumRemoteProxy:658)
The text was updated successfully, but these errors were encountered: