Bug 5055: FATAL FwdState::noteDestinationsEnd exception: opening (#967)

* Bug 5055: FATAL FwdState::noteDestinationsEnd exception: opening The bug was caused by commit 25b0ce4. Other known symptoms are: assertion failed: store.cc:1793: "isEmpty()" assertion failed: FwdState.cc:501: "serverConnection() == conn" assertion failed: FwdState.cc:1037: "!opening()" This change has several overlapping parts. Unfortunately, merging individual parts is both difficult and likely to cause crashes. ## Part 1: Bug 5055. FwdState used to check serverConn to decide whether to open a connection to forward the request. Since commit 25b0ce4, a nil serverConn pointer no longer implies that a new connection should be opened: FwdState helper jobs may already be working on preparing an existing open connection (e.g., sending a CONNECT request or negotiating encryption). Bad serverConn checks in both FwdState::noteDestination() and FwdState::noteDestinationsEnd() methods led to extra connectStart() calls creating two conflicting concurrent helper jobs. To fix this, we replaced direct serverConn inspection with a usingDestination() call which also checks whether we are waiting for a helper job. Testing that fix exposed another set of bugs: The helper job pointers or in-job connections left stale/set after forwarding failures. The changes described below addressed (most of) those problems. ## Part 2: Connection establishing helper jobs and their callbacks A proper fix for Bug 5055 required answering a difficult question: When should a dying job call its callbacks? We only found one answer which required cooperation from the job creator and led to the following rules: * AsyncJob destructors must not call any callbacks. * AsyncJob::swanSong() is responsible for async-calling any remaining (i.e. set, uncalled, and uncancelled) callbacks. * AsyncJob::swanSong() is called (only) for started jobs. * AsyncJob destructing sequence should validate that no callbacks remain uncalled for started jobs. ... where an AsyncJob x is considered "started" if AsyncJob::Start(x) has returned without throwing. A new JobWait class helps job creators follow these rules while keeping track on in-progress helper jobs and killing no-longer-needed helpers. Also fixed very similar bugs in tunnel.cc code. ## Part 3: ConnOpener fixes 1. Many ConnOpener users are written to keep a ConnectionPointer to the destination given to ConnOpener. This means that their connection magically opens when ConnOpener successfully connects, before ConnOpener has a chance to notify the user about the changes. Having multiple concurrent connection owners is always dangerous, and the user cannot even have a close handler registered for its now-open connection. When something happens to ConnOpener or its answer, the user job may be in trouble. Now, ConnOpener callers no longer pass Connection objects they own, cloning them as needed. That adjustment required adjustment 2: 2. Refactored ConnOpener users to stop assuming that the answer contains a pointer to their connection object. After adjustment 1 above, it does not. HappyConnOpener relied on that assumption quite a bit so we had to refactor to use two custom callback methods instead of one with a complicated if-statement distinguishing prime from spare attempts. This refactoring is an overall improvement because it simplifies the code. Other ConnOpener users just needed to remove a few no longer valid paranoid assertions/Musts. 3. ConnOpener users were forced to remember to close params.conn when processing negative answers. Some, naturally, forgot, triggering warnings about orphaned connections (e.g., Ident and TcpLogger). ConnOpener now closes its open connection before sending a negative answer. 4. ConnOpener would trigger orphan connection warnings if the job ended after opening the connection but without supplying the connection to the requestor (e.g., because the requestor has gone away). Now, ConnOpener explicitly closes its open connection if it has not been sent to the requestor. Also fixed Comm::ConnOpener::cleanFd() debugging that was incorrectly saying that the method closes the temporary descriptor. Also fixed ConnOpener callback's syncWithComm(): The stale CommConnectCbParams override was testing unused (i.e. always negative) CommConnectCbParams::fd and was trying to cancel the callback that most (possibly all) recipients rely on: ConnOpener users expect a negative answer rather than no answer at all. Also, after comparing the needs of two old/existing and a temporary added ("clone everything") Connection cloning method callers, we decided there is no need to maintain three different methods. All existing callers should be fine with a single method because none of them suffers from "extra" copying of members that others need. Right now, the new cloneProfile() method copies everything except FD and a few special-purpose members (with documented reasons for not copying). Also added Comm::Connection::cloneDestinationDetails() debugging to simplify tracking dependencies between half-baked Connection objects carrying destination/flags/other metadata and open Connection objects created by ConnOpener using that metadata (which are later delivered to ConnOpener users and, in some cases, replace those half-baked connections mentioned earlier. Long-term, we need to find a better way to express these and other Connection states/stages than comments and debugging messages. ## Part 4: Comm::Connection closure callbacks We improved many closure callbacks to make sure (to the extent possible) that Connection and other objects are in sync with Comm. There are lots of small bugs, inconsistencies, and other problems in Connection closure handlers. It is not clear whether any of those problems could result in serious runtime errors or leaks. In theory, the rest of the code could neutralize their negative side effects. However, even in that case, it was just a matter of time before the next bug will bite us due to stale Connection::fd and such. These changes themselves carry elevated risk, but they get us closer to reliable code as far as Connection maintenance is concerned; otherwise, we will keep chasing their deadly side effects. Long-term, all these manual efforts to keep things in sync should become unnecessary with the introduction of appropriate Connection ownership APIs that automatically maintain the corresponding environments (TODO). ## Part 5: Other notable improvements in the adjusted code Improved Security::PeerConnector::serverConn and Http::Tunneler::connection management, especially when sending negative answers. When sending a negative answer, we would set answer().conn to an open connection, async-send that answer, and then hurry to close the connection using our pointer to the shared Connection object. If everything went according to the plan, the recipient would get a non-nil but closed Connection object. Now, a negative answer simply means no connection at all. Same for a tunneled answer. Refactored ICAP connection-establishing code to to delay Connection ownership until the ICAP connection is fully ready. This change addresses primary Connection ownership concerns (as they apply to this ICAP code) except orphaning of the temporary Connection object by helper job start exceptions (now an explicit XXX). For example, the transaction no longer shares a Connection object with ConnOpener and IcapPeerConnector jobs. Probably fixed a bug where PeerConnector::negotiate() assumed that a sslFinalized() does not return true after callBack(). It may (e.g., when CertValidationHelper::Submit() throws). Same for PeekingPeerConnector::checkForPeekAndSpliceMatched(). Fixed FwdState::advanceDestination() bug that did not save ERR_GATEWAY_FAILURE details and "lost" the address of that failed destination, making it unavailable to future retries (if any). Polished PeerPoolMgr, Ident, and Gopher code to be able to fix similar job callback and connection management issues. Polished AsyncJob::Start() API. Start() used to return a job pointer, but that was a bad idea: * It implies that a failed Start() will return a nil pointer, and that the caller should check the result. Neither is true. * It encourages callers to dereference the returned pointer to further adjust the job. That technically works (today) but violates the rules of communicating with an async job. The Start() method is the boundary after which the job is deemed asynchronous. Also removed old "and wait for..." post-Start() comments because the code itself became clear enough, and those comments were becoming increasingly stale (because they duplicated the callback names above them). Fix Tunneler and PeerConnector handling of last-resort callback requirements. Events like handleStopRequest() and callException() stop the job but should not be reported as a BUG (e.g., it would be up to the callException() to decide how to report the caught exception). There might (or there will) be other, similar cases where the job is stopped prematurely for some non-BUG reason beyond swanSong() knowledge. The existence of non-bug cases does not mean there could be no bugs worth reporting here, but until they can be identified more reliably than all these benign/irrelevant cases, reporting no BUGs is a (much) lesser evil. TODO: Revise AsyncJob::doneAll(). Many of its overrides are written to check for both positive (i.e. mission accomplished) and negative (i.e. mission cancelled or cannot be accomplished) conditions, but the latter is usually unnecessary, especially after we added handleStopRequest() API to properly support external job cancellation events. Many doneAll() overrides can probably be greatly simplified. ---- Cherry picked SQUID-568-premature-serverconn-use-v5 commit 22b5f78. * fixup: Cherry-picked SQUID-568-premature-serverconn-use PR-time fixes In git log order: * e64a6c1: Undone an out-of-scope change and added a missing 'auto' * aeaf83d: Fixed an 'unused parameter' error * f49d009: fixup: No explicit destructors with implicit copying methods * c30c37f: Removed an unnecessary explicit copy constructor * 012f5ec: Excluded Connection::rfc931 from cloning * 366c78a: ICAP: do not set connect_timeout on the established conn... This branch is now in sync with SQUID-568-premature-serverconn-use (S) commit e64a6c1 (except for official changes merged from master/v6 into S closer to the end of PR 877 work (i.e. S' merge commit 0a7432a). * Fix FATAL ServiceRep::putConnection exception: theBusyConns > 0 FATAL: check failed: theBusyConns > 0 exception location: ServiceRep.cc(163) putConnection Since master/v6 commit 2b6b1bc, a timeout on a ready-to-shovel Squid-service ICAP connection was decrementing theBusyConns level one extra time because Adaptation::Icap::Xaction::noteCommTimedout() started calling both noteConnectionFailed() and closeConnection(). Depending on the actual theBusyConns level, the extra decrement could result in FATAL errors later, when putConnection() was called (for a different ICAP transaction) with zero theBusyConns in an exception-unprotected context. Throughout these changes, Xaction still counts the above timeouts as a service failure. That is done by calling ServiceRep::noteFailure() from Xaction::callException(), including in timeout cases described above. ---- Cherry-picked master/v6 commit a8ac892.
squid-cache · Feb 8, 2022 · 1332f8d · 1332f8d
1 parent 6037006
commit 1332f8d
Show file tree

Hide file tree

Showing 60 changed files with 991 additions and 568 deletions.
diff --git a/src/BodyPipe.cc b/src/BodyPipe.cc
@@ -335,6 +335,7 @@ BodyPipe::startAutoConsumptionIfNeeded()
         return;
 
     theConsumer = new BodySink(this);
+    AsyncJob::Start(theConsumer);
     debugs(91,7, HERE << "starting auto consumption" << status());
     scheduleBodyDataNotification();
 }

diff --git a/src/CommCalls.cc b/src/CommCalls.cc
@@ -53,6 +53,9 @@ CommAcceptCbParams::CommAcceptCbParams(void *aData):
 {
 }
 
+// XXX: Add CommAcceptCbParams::syncWithComm(). Adjust syncWithComm() API if all
+// implementations always return true.
+
 void
 CommAcceptCbParams::print(std::ostream &os) const
 {
@@ -72,13 +75,24 @@ CommConnectCbParams::CommConnectCbParams(void *aData):
 bool
 CommConnectCbParams::syncWithComm()
 {
-    // drop the call if the call was scheduled before comm_close but
-    // is being fired after comm_close
-    if (fd >= 0 && fd_table[fd].closing()) {
-        debugs(5, 3, HERE << "dropping late connect call: FD " << fd);
-        return false;
+    assert(conn);
+
+    // change parameters if this is a successful callback that was scheduled
+    // after its Comm-registered connection started to close
+
+    if (flag != Comm::OK) {
+        assert(!conn->isOpen());
+        return true; // not a successful callback; cannot go out of sync
     }
-    return true; // now we are in sync and can handle the call
+
+    assert(conn->isOpen());
+    if (!fd_table[conn->fd].closing())
+        return true; // no closing in progress; in sync (for now)
+
+    debugs(5, 3, "converting to Comm::ERR_CLOSING: " << conn);
+    conn->noteClosure();
+    flag = Comm::ERR_CLOSING;
+    return true; // now the callback is in sync with Comm again
 }
 
 /* CommIoCbParams */

diff --git a/src/Downloader.cc b/src/Downloader.cc
@@ -81,6 +81,10 @@ void
 Downloader::swanSong()
 {
     debugs(33, 6, this);
+
+    if (callback_) // job-ending emergencies like handleStopRequest() or callException()
+        callBack(Http::scInternalServerError);
+
     if (context_) {
         context_->finished();
         context_ = nullptr;
@@ -251,6 +255,7 @@ Downloader::downloadFinished()
 void
 Downloader::callBack(Http::StatusCode const statusCode)
 {
+    assert(callback_);
     CbDialer *dialer = dynamic_cast<CbDialer*>(callback_->getDialer());
     Must(dialer);
     dialer->status = statusCode;

diff --git a/src/FwdState.cc b/src/FwdState.cc
@@ -207,26 +207,22 @@ FwdState::stopAndDestroy(const char *reason)
 {
     debugs(17, 3, "for " << reason);
 
-    if (opening())
-        cancelOpening(reason);
+    cancelStep(reason);
 
     PeerSelectionInitiator::subscribed = false; // may already be false
     self = nullptr; // we hope refcounting destroys us soon; may already be nil
     /* do not place any code here as this object may be gone by now */
 }
 
-/// Notify connOpener that we no longer need connections. We do not have to do
-/// this -- connOpener would eventually notice on its own, but notifying reduces
-/// waste and speeds up spare connection opening for other transactions (that
-/// could otherwise wait for this transaction to use its spare allowance).
+/// Notify a pending subtask, if any, that we no longer need its help. We do not
+/// have to do this -- the subtask job will eventually end -- but ending it
+/// earlier reduces waste and may reduce DoS attack surface.
 void
-FwdState::cancelOpening(const char *reason)
+FwdState::cancelStep(const char *reason)
 {
-    assert(calls.connector);
-    calls.connector->cancel(reason);
-    calls.connector = nullptr;
-    notifyConnOpener();
-    connOpener.clear();
+    transportWait.cancel(reason);
+    encryptionWait.cancel(reason);
+    peerWait.cancel(reason);
 }
 
 #if STRICT_ORIGINAL_DST
@@ -348,8 +344,7 @@ FwdState::~FwdState()
 
     entry = NULL;
 
-    if (opening())
-        cancelOpening("~FwdState");
+    cancelStep("~FwdState");
 
     if (Comm::IsConnOpen(serverConn))
         closeServerConnection("~FwdState");
@@ -501,8 +496,17 @@ FwdState::fail(ErrorState * errorState)
     if (!errorState->request)
         errorState->request = request;
 
-    if (err->type != ERR_ZERO_SIZE_OBJECT)
-        return;
+    if (err->type == ERR_ZERO_SIZE_OBJECT)
+        reactToZeroSizeObject();
+
+    destinationReceipt = nullptr; // may already be nil
+}
+
+/// ERR_ZERO_SIZE_OBJECT requires special adjustments
+void
+FwdState::reactToZeroSizeObject()
+{
+    assert(err->type == ERR_ZERO_SIZE_OBJECT);
 
     if (pconnRace == racePossible) {
         debugs(17, 5, HERE << "pconn race happened");
@@ -566,6 +570,8 @@ FwdState::complete()
 
         if (Comm::IsConnOpen(serverConn))
             unregister(serverConn);
+        serverConn = nullptr;
+        destinationReceipt = nullptr;
 
         storedWholeReply_ = nullptr;
         entry->reset();
@@ -584,6 +590,12 @@ FwdState::complete()
     }
 }
 
+bool
+FwdState::usingDestination() const
+{
+    return encryptionWait || peerWait || Comm::IsConnOpen(serverConn);
+}
+
 void
 FwdState::markStoredReplyAsWhole(const char * const whyWeAreSure)
 {
@@ -613,19 +625,19 @@ FwdState::noteDestination(Comm::ConnectionPointer path)
 
     destinations->addPath(path);
 
-    if (Comm::IsConnOpen(serverConn)) {
+    if (usingDestination()) {
         // We are already using a previously opened connection, so we cannot be
-        // waiting for connOpener. We still receive destinations for backup.
-        Must(!opening());
+        // waiting for it. We still receive destinations for backup.
+        Must(!transportWait);
         return;
     }
 
-    if (opening()) {
+    if (transportWait) {
         notifyConnOpener();
         return; // and continue to wait for FwdState::noteConnection() callback
     }
 
-    // This is the first path candidate we have seen. Create connOpener.
+    // This is the first path candidate we have seen. Use it.
     useDestinations();
 }
 
@@ -653,19 +665,19 @@ FwdState::noteDestinationsEnd(ErrorState *selectionError)
     // if all of them fail, forwarding as whole will fail
     Must(!selectionError); // finding at least one path means selection succeeded
 
-    if (Comm::IsConnOpen(serverConn)) {
+    if (usingDestination()) {
         // We are already using a previously opened connection, so we cannot be
-        // waiting for connOpener. We were receiving destinations for backup.
-        Must(!opening());
+        // waiting for it. We were receiving destinations for backup.
+        Must(!transportWait);
         return;
     }
 
-    Must(opening()); // or we would be stuck with nothing to do or wait for
+    Must(transportWait); // or we would be stuck with nothing to do or wait for
     notifyConnOpener();
     // and continue to wait for FwdState::noteConnection() callback
 }
 
-/// makes sure connOpener knows that destinations have changed
+/// makes sure connection opener knows that the destinations have changed
 void
 FwdState::notifyConnOpener()
 {
@@ -674,7 +686,7 @@ FwdState::notifyConnOpener()
     } else {
         debugs(17, 7, "notifying about " << *destinations);
         destinations->notificationPending = true;
-        CallJobHere(17, 5, connOpener, HappyConnOpener, noteCandidatesChange);
+        CallJobHere(17, 5, transportWait.job(), HappyConnOpener, noteCandidatesChange);
     }
 }
 
@@ -684,7 +696,7 @@ static void
 fwdServerClosedWrapper(const CommCloseCbParams &params)
 {
     FwdState *fwd = (FwdState *)params.data;
-    fwd->serverClosed(params.fd);
+    fwd->serverClosed();
 }
 
 /**** PRIVATE *****************************************************************/
@@ -754,13 +766,23 @@ FwdState::checkRetriable()
 }
 
 void
-FwdState::serverClosed(int fd)
+FwdState::serverClosed()
 {
-    // XXX: fd is often -1 here
-    debugs(17, 2, "FD " << fd << " " << entry->url() << " after " <<
-           (fd >= 0 ? fd_table[fd].pconn.uses : -1) << " requests");
-    if (fd >= 0 && serverConnection()->fd == fd)
-        fwdPconnPool->noteUses(fd_table[fd].pconn.uses);
+    // XXX: This method logic attempts to tolerate Connection::close() called
+    // for serverConn earlier, by one of our dispatch()ed jobs. If that happens,
+    // serverConn will already be closed here or, worse, it will already be open
+    // for the next forwarding attempt. The current code prevents us getting
+    // stuck, but the long term solution is to stop sharing serverConn.
+    debugs(17, 2, serverConn);
+    if (Comm::IsConnOpen(serverConn)) {
+        const auto uses = fd_table[serverConn->fd].pconn.uses;
+        debugs(17, 3, "prior uses: " << uses);
+        fwdPconnPool->noteUses(uses); // XXX: May not have come from fwdPconnPool
+        serverConn->noteClosure();
+    }
+    serverConn = nullptr;
+    closeHandler = nullptr;
+    destinationReceipt = nullptr;
     retryOrBail();
 }
 
@@ -802,6 +824,8 @@ FwdState::handleUnregisteredServerEnd()
 {
     debugs(17, 2, HERE << "self=" << self << " err=" << err << ' ' << entry->url());
     assert(!Comm::IsConnOpen(serverConn));
+    serverConn = nullptr;
+    destinationReceipt = nullptr;
     retryOrBail();
 }
 
@@ -819,6 +843,8 @@ FwdState::advanceDestination(const char *stepDescription, const Comm::Connection
     } catch (...) {
         debugs (17, 2, "exception while trying to " << stepDescription << ": " << CurrentException);
         closePendingConnection(conn, "connection preparation exception");
+        if (!err)
+            fail(new ErrorState(ERR_GATEWAY_FAILURE, Http::scInternalServerError, request, al));
         retryOrBail();
     }
 }
@@ -830,8 +856,7 @@ FwdState::noteConnection(HappyConnOpener::Answer &answer)
 {
     assert(!destinationReceipt);
 
-    calls.connector = nullptr;
-    connOpener.clear();
+    transportWait.finish();
 
     Must(n_tries <= answer.n_tries); // n_tries cannot decrease
     n_tries = answer.n_tries;
@@ -843,6 +868,8 @@ FwdState::noteConnection(HappyConnOpener::Answer &answer)
         Must(!Comm::IsConnOpen(answer.conn));
         answer.error.clear(); // preserve error for errorSendComplete()
     } else if (!Comm::IsConnOpen(answer.conn) || fd_table[answer.conn->fd].closing()) {
+        // The socket could get closed while our callback was queued. Sync
+        // Connection. XXX: Connection::fd may already be stale/invalid here.
         // We do not know exactly why the connection got closed, so we play it
         // safe, allowing retries only for persistent (reused) connections
         if (answer.reused) {
@@ -912,23 +939,24 @@ FwdState::establishTunnelThruProxy(const Comm::ConnectionPointer &conn)
     if (!conn->getPeer()->options.no_delay)
         tunneler->setDelayId(entry->mem_obj->mostBytesAllowed());
 #endif
-    AsyncJob::Start(tunneler);
-    // and wait for the tunnelEstablishmentDone() call
+    peerWait.start(tunneler, callback);
 }
 
 /// resumes operations after the (possibly failed) HTTP CONNECT exchange
 void
 FwdState::tunnelEstablishmentDone(Http::TunnelerAnswer &answer)
 {
+    peerWait.finish();
+
     ErrorState *error = nullptr;
     if (!answer.positive()) {
-        Must(!Comm::IsConnOpen(answer.conn));
+        Must(!answer.conn);
         error = answer.squidError.get();
         Must(error);
         answer.squidError.clear(); // preserve error for fail()
     } else if (!Comm::IsConnOpen(answer.conn) || fd_table[answer.conn->fd].closing()) {
-        // The socket could get closed while our callback was queued.
-        // We close Connection here to sync Connection::fd.
+        // The socket could get closed while our callback was queued. Sync
+        // Connection. XXX: Connection::fd may already be stale/invalid here.
         closePendingConnection(answer.conn, "conn was closed while waiting for tunnelEstablishmentDone");
         error = new ErrorState(ERR_CANNOT_FORWARD, Http::scServiceUnavailable, request, al);
     } else if (!answer.leftovers.isEmpty()) {
@@ -998,18 +1026,21 @@ FwdState::secureConnectionToPeer(const Comm::ConnectionPointer &conn)
 #endif
         connector = new Security::BlindPeerConnector(requestPointer, conn, callback, al, sslNegotiationTimeout);
     connector->noteFwdPconnUse = true;
-    AsyncJob::Start(connector); // will call our callback
+    encryptionWait.start(connector, callback);
 }
 
 /// called when all negotiations with the TLS-speaking peer have been completed
 void
 FwdState::connectedToPeer(Security::EncryptorAnswer &answer)
 {
+    encryptionWait.finish();
+
     ErrorState *error = nullptr;
     if ((error = answer.error.get())) {
-        Must(!Comm::IsConnOpen(answer.conn));
+        assert(!answer.conn);
         answer.error.clear(); // preserve error for errorSendComplete()
     } else if (answer.tunneled) {
+        assert(!answer.conn);
         // TODO: When ConnStateData establishes tunnels, its state changes
         // [in ways that may affect logging?]. Consider informing
         // ConnStateData about our tunnel or otherwise unifying tunnel
@@ -1019,6 +1050,8 @@ FwdState::connectedToPeer(Security::EncryptorAnswer &answer)
         complete(); // destroys us
         return;
     } else if (!Comm::IsConnOpen(answer.conn) || fd_table[answer.conn->fd].closing()) {
+        // The socket could get closed while our callback was queued. Sync
+        // Connection. XXX: Connection::fd may already be stale/invalid here.
         closePendingConnection(answer.conn, "conn was closed while waiting for connectedToPeer");
         error = new ErrorState(ERR_CANNOT_FORWARD, Http::scServiceUnavailable, request, al);
     }
@@ -1091,22 +1124,20 @@ FwdState::connectStart()
     Must(!request->pinnedConnection());
 
     assert(!destinations->empty());
-    assert(!opening());
+    assert(!usingDestination());
 
     // Ditch error page if it was created before.
     // A new one will be created if there's another problem
     delete err;
     err = nullptr;
     request->clearError();
-    serverConn = nullptr;
-    destinationReceipt = nullptr;
 
     request->hier.startPeerClock();
 
-    calls.connector = asyncCall(17, 5, "FwdState::noteConnection", HappyConnOpener::CbDialer<FwdState>(&FwdState::noteConnection, this));
+    AsyncCall::Pointer callback = asyncCall(17, 5, "FwdState::noteConnection", HappyConnOpener::CbDialer<FwdState>(&FwdState::noteConnection, this));
 
     HttpRequest::Pointer cause = request;
-    const auto cs = new HappyConnOpener(destinations, calls.connector, cause, start_t, n_tries, al);
+    const auto cs = new HappyConnOpener(destinations, callback, cause, start_t, n_tries, al);
     cs->setHost(request->url.host());
     bool retriable = checkRetriable();
     if (!retriable && Config.accessList.serverPconnForNonretriable) {
@@ -1118,8 +1149,7 @@ FwdState::connectStart()
     cs->setRetriable(retriable);
     cs->allowPersistent(pconnRace != raceHappened);
     destinations->notificationPending = true; // start() is async
-    connOpener = cs;
-    AsyncJob::Start(cs);
+    transportWait.start(cs, callback);
 }
 
 /// send request on an existing connection dedicated to the requesting client