Skip to content

[🐛 Bug]: Retry of hub on TimeoutException does perform Command multiple times #12975

@joerg1985

Description

@joerg1985

What happened?

The retry handling of the hub might execute a command twice, after the client already failed.
The hub will retry sending a command to the node on TimeoutException, although the driver did execute the command the first time and is only waiting for the page to load. See #12558 for a potential real world occurrence of this issue.

I think the TimeoutException should not be handled by the retry filter of the hub, as the command has allready been send to the driver. The other retrys are okay in my mind, e.g. on ConnectionTimeout.

The default timeouts should also be updated, the default w3c page load timeout is 300s and the default read timeout of the client is 180s. In general it might make sense to have a slightly higher read time on the client as on hub the to ensure the error generated by the hub has some time to be send to the client. The same should also be considered for the timeout of the node to driver communication.

How can we reproduce the issue?

** this will only work with a hub/node setup, the standalone does not use http internally to handle requests **
** run the node with -Djdk.httpclient.HttpClient.log=headers to see the requests send to the driver **

This dummy code will provoke the issue after ~5 min, the chromedriver will wait after the click up to 300s for the page to load. The http client on the client will timeout after 180s and raise a TimeoutException. The http client on the hub will also raise a TimeoutException after 180s and retry to send the command to the driver.
The click is than performed again, look at second call to '../click' in output of the node.


import java.net.URL;
import java.time.Duration;
import java.time.LocalTime;
import java.time.temporal.ChronoUnit;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.openqa.selenium.By;
import org.openqa.selenium.remote.RemoteWebDriver;
import org.apache.hc.core5.http.impl.bootstrap.HttpServer;
import org.apache.hc.core5.http.impl.bootstrap.ServerBootstrap;
import org.apache.hc.core5.http.io.SocketConfig;
import org.apache.hc.core5.http.io.entity.HttpEntities;
import org.apache.hc.core5.util.TimeValue;
import org.openqa.selenium.TimeoutException;
import org.openqa.selenium.chrome.ChromeOptions;


public class RetryClick {

    private static final Logger LOG = Logger.getLogger(TEts.class.getName());

    public static void main(String[] args) throws Exception {
        final SocketConfig socketConfig = SocketConfig.custom()
                .setSoTimeout(5, TimeUnit.MINUTES)
                .setTcpNoDelay(true)
                .build();
        
        CountDownLatch latch = new CountDownLatch(3);

        final HttpServer httpServer = ServerBootstrap.bootstrap()
                .setListenerPort(8080)
                .setSocketConfig(socketConfig)
                // Application request handler
                .register("/index.html", (request, response, context) -> {
                    LOG.info("/index.html was called");
                    latch.countDown();

                    if (latch.getCount() < 2) { // do not delay the first call
                        // delay more than read timeout and less than pageloaded timeout
                        try {
                            Thread.sleep(184 * 1000);
                        } catch (InterruptedException e) {
                            System.exit(666);
                        }
                    }

                    response.setCode(200);
                    response.setHeader("content-type", "text/html; charset=utf-8");
                    response.setEntity(HttpEntities.create("<html><body><h1><a href='index.html'>Please wait!</a></h1></body></html>"));
                })
                .create();

        httpServer.start();
        RemoteWebDriver driver = new RemoteWebDriver(new URL("http://localhost:4444"), new ChromeOptions());

        // 300s is the default value of the W3C spec
        driver.manage().timeouts().pageLoadTimeout(Duration.ofSeconds(300));

        LOG.info("call the url");
        driver.get("http://127.0.0.1:8080/index.html");

        try {
            LOG.info("do the click");
            driver.findElement(By.tagName("a")).click();
        } catch (TimeoutException e) {
            LOG.info("look at the output of the hub for the second attemp to click");
            Thread.sleep(60 * 8 * 1000);
        } finally {
            // wait for the logger to print all messages
            Thread.sleep(1000);
            driver.close();
            httpServer.close();
            httpServer.awaitTermination(TimeValue.MAX_VALUE);
        }
    }
}

Relevant log output

12:03:50.654 INFO [Http1Request.logHeaders] - HEADERS: REQUEST HEADERS:
POST /session/50c92665cdb2fb5f70a460b9705a8487/element/93C59E4F5009CACA09C5764D606F21A7_element_3/click HTTP/1.1
Connection: Upgrade, HTTP2-Settings
Content-Length: 56
Host: localhost:27975
HTTP2-Settings: AAEAAEAAAAIAAAABAAMAAABkAAQBAAAAAAUAAEAA
Upgrade: h2c
Cache-Control: no-cache
Content-Type: application/json; charset=utf-8
HTTP2-Settings: AAEAAEAAAAIAAAABAAMAAABkAAQBAAAAAAUAAEAA
HTTP2-Settings: AAEAAEAAAAIAAAABAAMAAABkAAQBAAAAAAUAAEAA
traceparent: 00-1c2f2267e7cbdb6e65e170f9f08fe771-f173474c1f3156cc-01
User-Agent: selenium/4.14.1 (java windows)

12:06:50.675 INFO [Http1Request.logHeaders] - HEADERS: REQUEST HEADERS:
POST /session/50c92665cdb2fb5f70a460b9705a8487/element/93C59E4F5009CACA09C5764D606F21A7_element_3/click HTTP/1.1
Connection: Upgrade, HTTP2-Settings
Content-Length: 56
Host: localhost:27975
HTTP2-Settings: AAEAAEAAAAIAAAABAAMAAABkAAQBAAAAAAUAAEAA
Upgrade: h2c
Cache-Control: no-cache
Content-Type: application/json; charset=utf-8
HTTP2-Settings: AAEAAEAAAAIAAAABAAMAAABkAAQBAAAAAAUAAEAA
HTTP2-Settings: AAEAAEAAAAIAAAABAAMAAABkAAQBAAAAAAUAAEAA
traceparent: 00-1c2f2267e7cbdb6e65e170f9f08fe771-94909dcd0e14cb05-01
User-Agent: selenium/4.14.1 (java windows)

12:06:54.746 INFO [Http1Response.lambda$readHeadersAsync$0] - HEADERS: RESPONSE HEADERS:
    cache-control: no-cache
    content-length: 14
    content-type: application/json; charset=utf-8

12:06:54.753 INFO [Http1Response.lambda$readHeadersAsync$0] - HEADERS: RESPONSE HEADERS:
    cache-control: no-cache
    content-length: 1076
    content-type: application/json; charset=utf-8

Operating System

Windows 10 x64

Selenium version

4.14.1

What are the browser(s) and version(s) where you see this issue?

chrome 118

What are the browser driver(s) and version(s) where you see this issue?

all

Are you using Selenium Grid?

4.14.1

Metadata

Metadata

Assignees

No one assigned

    Labels

    B-gridEverything grid and server relatedI-defectSomething is not working as intended

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions