doCallouts before checking ssl_bump step2 rules #108

chtsanti · 2020-11-11T19:00:04Z

During step2, our documentation promises a doCallouts() call
(step 2.2) after TLS ClientHello is parsed (step 2.1) and before
ssl_bump is revisited (step 2.3):

2.1: Get TLS Client Hello info from the client, including SNI
       where available. Adjust the CONNECT request from step1 to
       reflect SNI.

 2.2: Go through the Callout Sequence with the adjusted CONNECT
         request mentioned above.

 2.3: Evaluate again all ssl_bump rules and perform the first 
        matching action (splice, bump, peek, stare, or terminate).

However currently in Squid v5 (at least), step 2.3 happens immediately
after step 2.1, breaking essential step2-blocking-with-an-error-message
use cases.

This patch fixes master/v6 code to match the documentation.

This is a Measurement Factory project

The documentation says that for step1 we must follow the rules: - Get TLS Client Hello info from the client, including SNI where available. Adjust the CONNECT request from step1 to reflect SNI. - Go through the Callout Sequence with the adjusted CONNECT request mentioned above. - Evaluate again all ssl_bump rules and perform the first matching action (splice, bump, peek, stare, or terminate). The callouts on step2 are never called. This patch fixes this.

Use a CallJob to call ConnStateData::resumePeekAndSpliceStep2, the http object may destructed during this call causing segfauts to squid

This is may interpreted as a Callouts call after ssl_bump step2 rules are checked. This commit may reverted if we decide that this fake CONNECT request is a completelly new request and should not confused with bumped CONNECT request.

Currently this check is done by ConnStateData class. This patch removes the related ConnStateData code and allow DoCallouts run the ssl_bump access checks for tls bumping step step2

... at tls bumping step1. Documentation says: i. Get TCP-level info from the client. * In forward proxy environments, also parse the CONNECT request. * In interception environments, create a fake CONNECT request using TCP-level info. ii. Go through the Callout Sequence with the CONNECT request mentioned above. iii. Evaluate all ssl_bump rules and perform the first matching action (splice, bump, peek, stare, or terminate). But for intercepted environments the order was: - evaluate ssl_bump rules - create fake CONNECT request - Go through callout sequency

Squid documentation says about step2: - Get TLS Client Hello info from the client, including SNI where available. Adjust the CONNECT request from step1 to reflect SNI. - Go through the Callout Sequence with the adjusted CONNECT request mentioned above. - Evaluate again all ssl_bump rules and perform the first matching action (splice, bump, peek, stare, or terminate). * Peeking at this step usually makes bumping at step 3 impossible. * Staring at this step usually makes splicing at step 3 impossible. However current implementation: - Does not adjust CONNECT request from step1 retrieved info (SNI). - The fix/commit "TLS bumping fix: Perform a second doCallouts when step2 runs the Callouts for step2 but this is not done with a safe way: the calouts called twice for each ClientHttpRequest object. This is not the way the ClientHttpRequest is designed. This patch at step2 builds a new ClientHttpRequest object and run the callouts for this object. It uses the same ClientHttpRequest object to splice if required.

and initiateTunneledRequest ConnStateData methods It is needed only in one place. For this unique case set the ConnStateData::inBuf member directly.

…uest The commit "Fix ssl bumping step2 to follow ..." builds a new request with the adjusted SNI info, but all the other properties (headers, auth info, notes etc) from step1 request are lost. This patch instead of building a new HttpRequest object and try set its parameters, clone the old one and replace the host name in URI and host header, if the SNI info exist and if the CONNECT string has a IP address and not a DNS name. This is must be enough. Changes: - Add a new ConnStateData::buildFakeRequest variant which takes as argument an HttpRequest object and builds a new ClientHttpRequest object. - Now the ConnStateData::startPeekAndSpliceStep2 instead of calling the fakeAConnectRequest or initiateTunneledRequest methods clones the original HttpRequest object, adjust it and use the new ConnStateData::buildFakeRequest method

Current code while the certificates are generated checks if peek and splice is enabled at step1 to decide if should generate SSL/CTX objects or configure already generated SSL/CTX objects (inside ConnStateData::sslCrtdHandleReply and ConnStateData::getSslContextStart methods). This is wrong because we may have decide to peek/stare at step1 but the procedure break before we generate the SSLL/CTX objects. This branch tried to fix this wrongly by checking if we are enter the Step3 too. This patch just checks if we have already generate SSL/CTX objects and if yes reconfigure them with the generated certificates else generate new SSL/CTX objects based on generated certificates. Also this patch renames (and slightly fixes) the ConnStateData doPeekAndSpliceStep() method to restartTlsNegotiation(). Actually this is what it does.

Currently client blocks waiting for ever squid response and squid does not process the readed HttpRequest to serve the error page.

rousskov

I left a few small change requests, but the primary purpose of this review is to ask a few questions about the proposed changes.

rousskov · 2021-04-23T16:00:24Z

src/client_side.cc

+        // Disable caching if the SSL object exists and has attached a CTX object
+        if (!(fd_table[clientConnection->fd].ssl)) {
            sslBumpCertKey.clear();
            Ssl::InRamCertificateDbKey(certProperties, sslBumpCertKey);
            assert(!sslBumpCertKey.isEmpty());


// Disable caching if the SSL object exists and has attached a CTX object

GitHub does not let me quote more code below these lines, but I see nothing about disabling HTTP(??) caching in those lines. The literal description of the two combined if statements guards feels unhelpful at best. A developer can read the code and see that the description matches the code, but still have no clue what these conditions actually mean (i.e. when do we want to disable caching and why only under those conditions).

Please clarify:

What kind of caching are we disabling here? HTTP, TLS sessions, other? Which call inside the if statement will disable that caching? Is the primary purpose of that call to disable caching?

What do we think the if statement conditions mean? The old comment was implying it means we are bumping the client connection (or something like that), but you have removed that so I assume it was wrong.

The old comment was not clear either so I welcome the opportunity to clarify what is going on here.

What kind of caching are we disabling here? HTTP, TLS sessions, other? Which call inside the if statement will disable that caching? Is the primary purpose of that call to disable caching?

Instead of using a generated certificates cache, we are using a SSL_CTX objects cache.

In client-first and server-first bumping mode, when we are entering this code we have not generate yet the OpenSSL required structures to initiate TLS conversation. The required objects are the SSL_CTX (which is also holds server certificate) and then the SSL object (which is built based on SSL_CTX object). In this case we are getting from cache a SSL_CTX object and we are building the SSL object based on this SSL_CTX object

In Peek/Stare modes we are creating SSL_CTX/SSL objects in very early steps and although there is the SSL_set_SSL_CTX method which allow you to set an SSL_CTX object to an SSL object we have consider this procedure unsafe so we decided to regenerate certificates (or get it from helpers cache) and attach the generated certificates to the connection existing (and empty/unconfigured) SSL_CTX object.

(I think it worths to recheck if we can use a cached SSL_CTX object from cache for stare modes.)

What do we think the if statement conditions mean? The old comment was implying it means we are bumping the client connection (or something like that), but you have removed that so I assume it was wrong.

I changed the old comment in my commit 521148e. This is because I realized that checks for step1 decisions are not safe and it is better to just check if the SSL/SSL_CTX objects are configured.

I think the correct comment should be something like the following:
// Do not use SSL_CTX objects cache if the SSL/SSL_CTX objects pair for this connection are built

Please recheck my commit 28c3d4d for the updated comment.

Please recheck my commit 28c3d4d for the updated comment.

I appreciate your earlier explanation. I may be able to polish this code/comments (with your help) further, but I cannot find the 28c3d4d commit you mentioned above. Perhaps you have not pushed it to GitHub?

src/client_side.cc

rousskov · 2021-04-26T01:33:10Z

src/client_side_request.cc

+    auto srvBump = getConn()->serverBump();
+    // XXX: The following "if" works however the bumping step is not updated
+    // correctly by peek-and-splice code.
+    if (srvBump && srvBump->at(XactionStep::tlsBump2)) {
+        // Update request object
+        srvBump->request = request;
+        srvBump->act.step2 = sslBumpNeed_;
+        CallJobHere(85, 4, getConn(), ConnStateData, resumePeekAndSpliceStep2);
+        return;
+    }
+
    if (untouchedConnect && sslBumpNeeded()) {


If we can swap the first if with the second if, please swap them. It seems odd to check for the second step before checking whether the first step is needed.

I did not do this change. If it is important I will try to change it however.
IF I change it the code will look something like the following:

if ((!srvBump && untouchedConnect && sslBumpNeeded() )|| (srvBump && srvBump->at(XactionStep::tlsBump1))) { ... } else if (srvBump && srvBump->at(XactionStep::tlsBump2) { ... }

If we are at bumping step1, normally we have not create yet the srvBump object (this is done later by client_side.cc code ), or we have create one just to serve an error (eg an HTTP deny).

IF I change it, the code will look [bad because] If we are at bumping step1, normally we have not create yet the srvBump object (this is done later by client_side.cc code ), or we have create one just to serve an error (eg an HTTP deny).

if ((!srvBump && untouchedConnect && sslBumpNeeded() ) || (srvBump && srvBump->at(XactionStep::tlsBump1))) { ... sslBumpStart(); return; } if (srvBump && srvBump->at(XactionStep::tlsBump2)) {

Would it be OK to create the srvBump object as soon as we know that we may need to evaluate ssl_bump? It feels like there are many code places where you have to keep reminding me that srvBump may not exist in some special cases and, in those special cases, we need to checks something else to understand what is going on. Can we create srvBump in all those special cases instead?

After creating srvBump object ASAP, we can probably make sslBumpNeeded() return false when srvBump has been created already. AFAICT, sslBumpNeeded() was meant to determine future need for SslBump. If we already created srvBump, then srvBump should be consulted to determine the current need.

At the end, I would not be surprised if sslBumpNeed_ would be completely gone because we would just create srvBump instead of (remembering that we should create it and using sslBumpNeed_ to remember that future need). Creation of srvBump will signify the beginning of step1.

The code would then look like this:

if (untouchedConnect && srvBump && srvBump->at(XactionStep::tlsBump1)) { ... sslBumpStart(); return; } if (srvBump && srvBump->at(XactionStep::tlsBump2)) {

The untouchedConnect part still looks a bit odd, but we can deal with that later, I guess.

N.B. We can rename "server" in serverBump()/etc. if needed (so that it is not specific to "server" bumping), but that is a minor naming discrepancy thing that we can do later/separately. I am a lot more worried about all the if statements that are very difficult to interpret correctly when nil serverBump() may mean something other than "we are not doing SslBump in any shape or form".

rousskov · 2021-04-26T01:36:17Z

src/client_side_request.cc

+    // XXX: The following "if" works however the bumping step is not updated
+    // correctly by peek-and-splice code.
+    if (srvBump && srvBump->at(XactionStep::tlsBump2)) {


In a GitHub comment, please detail or paraphrase what that XXX is trying to warn us about, please. I am trying to understand whether the step is wrong when we get here or should be updated inside the if statement but is not. Also, please discuss why are we not fixing this XXX in this PR.

commit 214ffc8 removes this comment.

Also, please discuss why are we not fixing this XXX in this PR.

I consider it as out of scope.
This fix also forced some cleanup in bumping procedure, the commits ab757cf and ab757cf

rousskov · 2021-04-26T01:42:44Z

src/client_side_request.cc

            // set final error but delay sending until we bump
-            Ssl::ServerBump *srvBump = new Ssl::ServerBump(this, e, Ssl::bumpClientFirst);
+            if (Ssl::ServerBump *srvBump = getConn()->serverBump()) {


If this if statement can be moved outside and before the sslBumpNeeded() check, please do so. It feels a bit odd and more fragile to first check for the need to bump (i.e. for something that still needs to be done) and only then to check whether we have bumped already (i.e. something that has happened in the past).

commit 2c9e47c

Maybe not exactly what you want, but I hope it looks better.

rousskov · 2021-04-26T01:56:57Z

src/ssl/ServerBump.cc

+        storeUnregister(sc, entry, this);
+        entry->unlock("Ssl::ServerBump");
+    }
+    entry = e;


If possible, please add:

Suggested change

entry = e;

assert(e);

entry = e;

commit 35abb86

rousskov · 2021-04-26T01:58:44Z

src/ssl/ServerBump.cc

+        entry->unlock("Ssl::ServerBump");
+    }


I know this will not affect the way the code works today, but this code may disengage from the entry-setting code below it. We should not leave dangling pointers, especially when the same member or variable is reused later.

Suggested change

entry->unlock("Ssl::ServerBump");

}

entry->unlock("Ssl::ServerBump#storeEntryError");

entry = nullptr;

}

The related code is changed with commit 35abb86

rousskov · 2021-04-26T02:00:42Z

src/ssl/ServerBump.cc

+    if (entry) {
+        assert(sc);
+        storeUnregister(sc, entry, this);
+        entry->unlock("Ssl::ServerBump");


Suggested change

entry->unlock("Ssl::ServerBump");

entry->unlock("Ssl::ServerBump#storeEntryError");

commit 82d319e

rousskov · 2021-04-26T02:01:33Z

src/ssl/ServerBump.cc

+        entry->unlock("Ssl::ServerBump");
+    }
+    entry = e;
+    entry->lock("Ssl::ServerBump");


We should mark each lock (and each unlock) location in a unique way to avoid guesswork during triage. Squid code already uses the URL-like # symbol to separate the primary code context from its lower-level location detail IIRC.

Suggested change

entry->lock("Ssl::ServerBump");

entry->lock("Ssl::ServerBump#storeEntryError");

commit 82d319e

rousskov · 2021-04-26T02:06:04Z

src/client_side_request.cc

-        return false;
-    }
+    if (srvBump && srvBump->at(XactionStep::tlsBump2)) {
+        assert(http->request->method == Http::METHOD_CONNECT);


Please discuss what this assert() means, and why we must check this condition during step2.

Well, we "must" not check.
Looks that I add it to debug bad states, eg where the bumping step is not updated correctly, eg when the ConnStateData->sslServerBump->step remains at step2 after bump completed and the first tunneled request enters this code.

The ClientRequestContext::sslBumpAccessCheck is called for all HTTP requests enters the doCallouts. But if the current ConnStateData object marks bumping step2 means that this is a doCallouts for a CONNECT request at bumping step2.

I did not remove it (yet).

- Currently when client first bumping mode selected at step1 we are not creating an Ssl::ServerBump object for ConnStateData object. But we are creating when we have to serve an error page using the client-first bumping mode. This patch fixes the ConnStateData::switchToHttps to always builds a Ssl::ServerBump mode for all bumping modes including client-first mode. - Inside ConnStateData::switchToHttps fix the check for ConnStateData::sslServerBump. If this member is already set here means that we are serving an error using client-first bumping mode. - Inside Ssl::PeekingPeerConnector replace any check for ConnStateData::sslServerBump existance with must clauses. This is must be always set when the Ssl::PeekingPeerConnector is called.

Add this check inside Ssl::PeekingPeerConnector::initialize and remove any check for request.flags.sslPeek from any other place inside Ssl::PeekingPeerConnector. Recheck: Should the check be moved inside Ssl::PeekingPeerConnector constructor?

- Add the new tlsBumpDone XactionStep to mark as ConnStateData bumping step after the TLS bump is completed because of connection terminate, splice or client-side and server-side TLS connections estalished. This is may allow use of the sslBump steps with at_step acl in other than ssl_bump access lists. - If the bump-server-first is selected at step tlsBump1 then update the ConnStateData bumping step to tlsBump3. - Assure that the PeekingPeerConnector is called always after we entered bumping step tlsBump3

Bumping at step1 (using "bump" as action) must result to client-first bumping step.

... inside ClientHttpRequest::doCallouts

The Ssl::ServerBump::entry member always is not nil The SslServerBump:sc member always is not nil

... inside ConnStateData::getSslContextStart

chtsanti added 12 commits November 10, 2020 19:59

Fix crash at step2 bumping step

6b6f075

Use a CallJob to call ConnStateData::resumePeekAndSpliceStep2, the http object may destructed during this call causing segfauts to squid

Remove an unneeded comment line

b1c5366

Execute the step2 ssl_bump access check from doCallouts procedure

81184b5

Currently this check is done by ConnStateData class. This patch removes the related ConnStateData code and allow DoCallouts run the ssl_bump access checks for tls bumping step step2

Fix a buggy if which prevent (at least) transparent TLS bumping to work

038b69e

Remove the payload parameter from buildFakeRequest, fakeAConnectRequest

e059416

and initiateTunneledRequest ConnStateData methods It is needed only in one place. For this unique case set the ConnStateData::inBuf member directly.

Fix serving error from step2 callouts

e917002

Currently client blocks waiting for ever squid response and squid does not process the readed HttpRequest to serve the error page.

rousskov requested changes Apr 26, 2021

View reviewed changes

chtsanti added 9 commits May 11, 2021 19:44

Fix commit "Fix ConnStateData::sslServerBump creation and ..."

3884644

Bumping at step1 (using "bump" as action) must result to client-first bumping step.

Reorder an if-check for tls bumping after error

2c9e47c

... inside ClientHttpRequest::doCallouts

Fix Ssl::ServerBump::storeEntryError

35abb86

The Ssl::ServerBump::entry member always is not nil The SslServerBump:sc member always is not nil

Rename ServerBump::storeEntryError to ServerBump::resetStoreEntry

f3b245a

Adjust StoreEntry::[lock,unlock] message in ServerBump::resetStoreEntry

82d319e

Fix a comment about SSL_CTX objects cache

28c3d4d

... inside ConnStateData::getSslContextStart

chtsanti mentioned this pull request Nov 2, 2021

Use a new PeekingPeerConnector::currentBumpingMode member #139

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

doCallouts before checking ssl_bump step2 rules #108

doCallouts before checking ssl_bump step2 rules #108

chtsanti commented Nov 11, 2020

rousskov left a comment

rousskov Apr 23, 2021

chtsanti May 10, 2021

chtsanti May 15, 2021

rousskov May 24, 2021

rousskov Apr 26, 2021

chtsanti May 15, 2021

rousskov May 24, 2021

rousskov Apr 26, 2021

chtsanti May 15, 2021

rousskov Apr 26, 2021

chtsanti May 15, 2021

rousskov Apr 26, 2021

chtsanti May 15, 2021

rousskov Apr 26, 2021

chtsanti May 15, 2021

rousskov Apr 26, 2021

chtsanti May 15, 2021

rousskov Apr 26, 2021

chtsanti May 15, 2021

rousskov Apr 26, 2021

chtsanti May 15, 2021

-        entry->unlock("Ssl::ServerBump");
-    }
+        entry->unlock("Ssl::ServerBump#storeEntryError");
+        entry = nullptr;
+    }

	entry->unlock("Ssl::ServerBump");
	entry->unlock("Ssl::ServerBump#storeEntryError");

	entry->lock("Ssl::ServerBump");
	entry->lock("Ssl::ServerBump#storeEntryError");

doCallouts before checking ssl_bump step2 rules #108

Are you sure you want to change the base?

doCallouts before checking ssl_bump step2 rules #108

Conversation

chtsanti commented Nov 11, 2020

rousskov left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment