New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
doCallouts before checking ssl_bump step2 rules #108
base: master
Are you sure you want to change the base?
Conversation
The documentation says that for step1 we must follow the rules: - Get TLS Client Hello info from the client, including SNI where available. Adjust the CONNECT request from step1 to reflect SNI. - Go through the Callout Sequence with the adjusted CONNECT request mentioned above. - Evaluate again all ssl_bump rules and perform the first matching action (splice, bump, peek, stare, or terminate). The callouts on step2 are never called. This patch fixes this.
Use a CallJob to call ConnStateData::resumePeekAndSpliceStep2, the http object may destructed during this call causing segfauts to squid
This is may interpreted as a Callouts call after ssl_bump step2 rules are checked. This commit may reverted if we decide that this fake CONNECT request is a completelly new request and should not confused with bumped CONNECT request.
Currently this check is done by ConnStateData class. This patch removes the related ConnStateData code and allow DoCallouts run the ssl_bump access checks for tls bumping step step2
... at tls bumping step1. Documentation says: i. Get TCP-level info from the client. * In forward proxy environments, also parse the CONNECT request. * In interception environments, create a fake CONNECT request using TCP-level info. ii. Go through the Callout Sequence with the CONNECT request mentioned above. iii. Evaluate all ssl_bump rules and perform the first matching action (splice, bump, peek, stare, or terminate). But for intercepted environments the order was: - evaluate ssl_bump rules - create fake CONNECT request - Go through callout sequency
Squid documentation says about step2: - Get TLS Client Hello info from the client, including SNI where available. Adjust the CONNECT request from step1 to reflect SNI. - Go through the Callout Sequence with the adjusted CONNECT request mentioned above. - Evaluate again all ssl_bump rules and perform the first matching action (splice, bump, peek, stare, or terminate). * Peeking at this step usually makes bumping at step 3 impossible. * Staring at this step usually makes splicing at step 3 impossible. However current implementation: - Does not adjust CONNECT request from step1 retrieved info (SNI). - The fix/commit "TLS bumping fix: Perform a second doCallouts when step2 runs the Callouts for step2 but this is not done with a safe way: the calouts called twice for each ClientHttpRequest object. This is not the way the ClientHttpRequest is designed. This patch at step2 builds a new ClientHttpRequest object and run the callouts for this object. It uses the same ClientHttpRequest object to splice if required.
and initiateTunneledRequest ConnStateData methods It is needed only in one place. For this unique case set the ConnStateData::inBuf member directly.
…uest The commit "Fix ssl bumping step2 to follow ..." builds a new request with the adjusted SNI info, but all the other properties (headers, auth info, notes etc) from step1 request are lost. This patch instead of building a new HttpRequest object and try set its parameters, clone the old one and replace the host name in URI and host header, if the SNI info exist and if the CONNECT string has a IP address and not a DNS name. This is must be enough. Changes: - Add a new ConnStateData::buildFakeRequest variant which takes as argument an HttpRequest object and builds a new ClientHttpRequest object. - Now the ConnStateData::startPeekAndSpliceStep2 instead of calling the fakeAConnectRequest or initiateTunneledRequest methods clones the original HttpRequest object, adjust it and use the new ConnStateData::buildFakeRequest method
Current code while the certificates are generated checks if peek and splice is enabled at step1 to decide if should generate SSL/CTX objects or configure already generated SSL/CTX objects (inside ConnStateData::sslCrtdHandleReply and ConnStateData::getSslContextStart methods). This is wrong because we may have decide to peek/stare at step1 but the procedure break before we generate the SSLL/CTX objects. This branch tried to fix this wrongly by checking if we are enter the Step3 too. This patch just checks if we have already generate SSL/CTX objects and if yes reconfigure them with the generated certificates else generate new SSL/CTX objects based on generated certificates. Also this patch renames (and slightly fixes) the ConnStateData doPeekAndSpliceStep() method to restartTlsNegotiation(). Actually this is what it does.
Currently client blocks waiting for ever squid response and squid does not process the readed HttpRequest to serve the error page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few small change requests, but the primary purpose of this review is to ask a few questions about the proposed changes.
src/client_side.cc
Outdated
// Disable caching if the SSL object exists and has attached a CTX object | ||
if (!(fd_table[clientConnection->fd].ssl)) { | ||
sslBumpCertKey.clear(); | ||
Ssl::InRamCertificateDbKey(certProperties, sslBumpCertKey); | ||
assert(!sslBumpCertKey.isEmpty()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Disable caching if the SSL object exists and has attached a CTX object
GitHub does not let me quote more code below these lines, but I see nothing about disabling HTTP(??) caching in those lines. The literal description of the two combined if
statements guards feels unhelpful at best. A developer can read the code and see that the description matches the code, but still have no clue what these conditions actually mean (i.e. when do we want to disable caching and why only under those conditions).
Please clarify:
-
What kind of caching are we disabling here? HTTP, TLS sessions, other? Which call inside the
if
statement will disable that caching? Is the primary purpose of that call to disable caching? -
What do we think the
if
statement conditions mean? The old comment was implying it means we are bumping the client connection (or something like that), but you have removed that so I assume it was wrong.
The old comment was not clear either so I welcome the opportunity to clarify what is going on here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- What kind of caching are we disabling here? HTTP, TLS sessions, other? Which call inside the if statement will disable that caching? Is the primary purpose of that call to disable caching?
Instead of using a generated certificates cache, we are using a SSL_CTX objects cache.
In client-first and server-first bumping mode, when we are entering this code we have not generate yet the OpenSSL required structures to initiate TLS conversation. The required objects are the SSL_CTX (which is also holds server certificate) and then the SSL object (which is built based on SSL_CTX object). In this case we are getting from cache a SSL_CTX object and we are building the SSL object based on this SSL_CTX object
In Peek/Stare modes we are creating SSL_CTX/SSL objects in very early steps and although there is the SSL_set_SSL_CTX method which allow you to set an SSL_CTX object to an SSL object we have consider this procedure unsafe so we decided to regenerate certificates (or get it from helpers cache) and attach the generated certificates to the connection existing (and empty/unconfigured) SSL_CTX object.
(I think it worths to recheck if we can use a cached SSL_CTX object from cache for stare modes.)
- What do we think the if statement conditions mean? The old comment was implying it means we are bumping the client connection (or something like that), but you have removed that so I assume it was wrong.
I changed the old comment in my commit 521148e. This is because I realized that checks for step1 decisions are not safe and it is better to just check if the SSL/SSL_CTX objects are configured.
I think the correct comment should be something like the following:
// Do not use SSL_CTX objects cache if the SSL/SSL_CTX objects pair for this connection are built
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please recheck my commit 28c3d4d for the updated comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please recheck my commit 28c3d4d for the updated comment.
I appreciate your earlier explanation. I may be able to polish this code/comments (with your help) further, but I cannot find the 28c3d4d commit you mentioned above. Perhaps you have not pushed it to GitHub?
src/client_side_request.cc
Outdated
auto srvBump = getConn()->serverBump(); | ||
// XXX: The following "if" works however the bumping step is not updated | ||
// correctly by peek-and-splice code. | ||
if (srvBump && srvBump->at(XactionStep::tlsBump2)) { | ||
// Update request object | ||
srvBump->request = request; | ||
srvBump->act.step2 = sslBumpNeed_; | ||
CallJobHere(85, 4, getConn(), ConnStateData, resumePeekAndSpliceStep2); | ||
return; | ||
} | ||
|
||
if (untouchedConnect && sslBumpNeeded()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can swap the first if
with the second if
, please swap them. It seems odd to check for the second step before checking whether the first step is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not do this change. If it is important I will try to change it however.
IF I change it the code will look something like the following:
if ((!srvBump && untouchedConnect && sslBumpNeeded() )|| (srvBump && srvBump->at(XactionStep::tlsBump1))) {
...
} else if (srvBump && srvBump->at(XactionStep::tlsBump2) {
...
}
If we are at bumping step1, normally we have not create yet the srvBump object (this is done later by client_side.cc code ), or we have create one just to serve an error (eg an HTTP deny).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IF I change it, the code will look [bad because] If we are at bumping step1, normally we have not create yet the srvBump object (this is done later by client_side.cc code ), or we have create one just to serve an error (eg an HTTP deny).
if ((!srvBump && untouchedConnect && sslBumpNeeded() ) ||
(srvBump && srvBump->at(XactionStep::tlsBump1))) {
...
sslBumpStart();
return;
}
if (srvBump && srvBump->at(XactionStep::tlsBump2)) {
Would it be OK to create the srvBump object as soon as we know that we may need to evaluate ssl_bump? It feels like there are many code places where you have to keep reminding me that srvBump may not exist in some special cases and, in those special cases, we need to checks something else to understand what is going on. Can we create srvBump in all those special cases instead?
After creating srvBump object ASAP, we can probably make sslBumpNeeded() return false when srvBump has been created already. AFAICT, sslBumpNeeded() was meant to determine future need for SslBump. If we already created srvBump, then srvBump should be consulted to determine the current need.
At the end, I would not be surprised if sslBumpNeed_ would be completely gone because we would just create srvBump instead of (remembering that we should create it and using sslBumpNeed_ to remember that future need). Creation of srvBump will signify the beginning of step1.
The code would then look like this:
if (untouchedConnect && srvBump && srvBump->at(XactionStep::tlsBump1)) {
...
sslBumpStart();
return;
}
if (srvBump && srvBump->at(XactionStep::tlsBump2)) {
The untouchedConnect
part still looks a bit odd, but we can deal with that later, I guess.
N.B. We can rename "server" in serverBump()/etc. if needed (so that it is not specific to "server" bumping), but that is a minor naming discrepancy thing that we can do later/separately. I am a lot more worried about all the if
statements that are very difficult to interpret correctly when nil serverBump() may mean something other than "we are not doing SslBump in any shape or form".
src/client_side_request.cc
Outdated
// XXX: The following "if" works however the bumping step is not updated | ||
// correctly by peek-and-splice code. | ||
if (srvBump && srvBump->at(XactionStep::tlsBump2)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a GitHub comment, please detail or paraphrase what that XXX is trying to warn us about, please. I am trying to understand whether the step is wrong when we get here or should be updated inside the if
statement but is not. Also, please discuss why are we not fixing this XXX in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/client_side_request.cc
Outdated
// set final error but delay sending until we bump | ||
Ssl::ServerBump *srvBump = new Ssl::ServerBump(this, e, Ssl::bumpClientFirst); | ||
if (Ssl::ServerBump *srvBump = getConn()->serverBump()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this if
statement can be moved outside and before the sslBumpNeeded() check, please do so. It feels a bit odd and more fragile to first check for the need to bump (i.e. for something that still needs to be done) and only then to check whether we have bumped already (i.e. something that has happened in the past).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commit 2c9e47c
Maybe not exactly what you want, but I hope it looks better.
storeUnregister(sc, entry, this); | ||
entry->unlock("Ssl::ServerBump"); | ||
} | ||
entry = e; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible, please add:
entry = e; | |
assert(e); | |
entry = e; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commit 35abb86
src/ssl/ServerBump.cc
Outdated
entry->unlock("Ssl::ServerBump"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this will not affect the way the code works today, but this code may disengage from the entry-setting code below it. We should not leave dangling pointers, especially when the same member or variable is reused later.
entry->unlock("Ssl::ServerBump"); | |
} | |
entry->unlock("Ssl::ServerBump#storeEntryError"); | |
entry = nullptr; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The related code is changed with commit 35abb86
src/ssl/ServerBump.cc
Outdated
if (entry) { | ||
assert(sc); | ||
storeUnregister(sc, entry, this); | ||
entry->unlock("Ssl::ServerBump"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
entry->unlock("Ssl::ServerBump"); | |
entry->unlock("Ssl::ServerBump#storeEntryError"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commit 82d319e
src/ssl/ServerBump.cc
Outdated
entry->unlock("Ssl::ServerBump"); | ||
} | ||
entry = e; | ||
entry->lock("Ssl::ServerBump"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should mark each lock (and each unlock) location in a unique way to avoid guesswork during triage. Squid code already uses the URL-like #
symbol to separate the primary code context from its lower-level location detail IIRC.
entry->lock("Ssl::ServerBump"); | |
entry->lock("Ssl::ServerBump#storeEntryError"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commit 82d319e
return false; | ||
} | ||
if (srvBump && srvBump->at(XactionStep::tlsBump2)) { | ||
assert(http->request->method == Http::METHOD_CONNECT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please discuss what this assert() means, and why we must check this condition during step2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we "must" not check.
Looks that I add it to debug bad states, eg where the bumping step is not updated correctly, eg when the ConnStateData->sslServerBump->step remains at step2 after bump completed and the first tunneled request enters this code.
The ClientRequestContext::sslBumpAccessCheck is called for all HTTP requests enters the doCallouts. But if the current ConnStateData object marks bumping step2 means that this is a doCallouts for a CONNECT request at bumping step2.
I did not remove it (yet).
- Currently when client first bumping mode selected at step1 we are not creating an Ssl::ServerBump object for ConnStateData object. But we are creating when we have to serve an error page using the client-first bumping mode. This patch fixes the ConnStateData::switchToHttps to always builds a Ssl::ServerBump mode for all bumping modes including client-first mode. - Inside ConnStateData::switchToHttps fix the check for ConnStateData::sslServerBump. If this member is already set here means that we are serving an error using client-first bumping mode. - Inside Ssl::PeekingPeerConnector replace any check for ConnStateData::sslServerBump existance with must clauses. This is must be always set when the Ssl::PeekingPeerConnector is called.
Add this check inside Ssl::PeekingPeerConnector::initialize and remove any check for request.flags.sslPeek from any other place inside Ssl::PeekingPeerConnector. Recheck: Should the check be moved inside Ssl::PeekingPeerConnector constructor?
- Add the new tlsBumpDone XactionStep to mark as ConnStateData bumping step after the TLS bump is completed because of connection terminate, splice or client-side and server-side TLS connections estalished. This is may allow use of the sslBump steps with at_step acl in other than ssl_bump access lists. - If the bump-server-first is selected at step tlsBump1 then update the ConnStateData bumping step to tlsBump3. - Assure that the PeekingPeerConnector is called always after we entered bumping step tlsBump3
Bumping at step1 (using "bump" as action) must result to client-first bumping step.
... inside ClientHttpRequest::doCallouts
The Ssl::ServerBump::entry member always is not nil The SslServerBump:sc member always is not nil
... inside ConnStateData::getSslContextStart
During step2, our documentation promises a doCallouts() call
(step 2.2) after TLS ClientHello is parsed (step 2.1) and before
ssl_bump is revisited (step 2.3):
However currently in Squid v5 (at least), step 2.3 happens immediately
after step 2.1, breaking essential step2-blocking-with-an-error-message
use cases.
This patch fixes master/v6 code to match the documentation.
This is a Measurement Factory project