Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doCallouts before checking ssl_bump step2 rules #108

Open
wants to merge 21 commits into
base: master
Choose a base branch
from

Conversation

chtsanti
Copy link

During step2, our documentation promises a doCallouts() call
(step 2.2) after TLS ClientHello is parsed (step 2.1) and before
ssl_bump is revisited (step 2.3):

2.1: Get TLS Client Hello info from the client, including SNI
       where available. Adjust the CONNECT request from step1 to
       reflect SNI.

 2.2: Go through the Callout Sequence with the adjusted CONNECT
         request mentioned above.

 2.3: Evaluate again all ssl_bump rules and perform the first 
        matching action (splice, bump, peek, stare, or terminate).

However currently in Squid v5 (at least), step 2.3 happens immediately
after step 2.1, breaking essential step2-blocking-with-an-error-message
use cases.

This patch fixes master/v6 code to match the documentation.

This is a Measurement Factory project

The documentation says that for step1 we must follow the rules:
 - Get TLS Client Hello info from the client, including SNI where
   available. Adjust the CONNECT request from step1 to reflect SNI.

  - Go through the Callout Sequence with the adjusted CONNECT request
    mentioned above.

  - Evaluate again all ssl_bump rules and perform the first matching
    action (splice, bump, peek, stare, or terminate).

The callouts on step2 are never called. This patch fixes this.
Use a CallJob to call ConnStateData::resumePeekAndSpliceStep2, the
http object may destructed during this call causing segfauts to squid
This is may interpreted as a Callouts call after ssl_bump step2 rules
are checked.

This commit may reverted if we decide that this fake CONNECT request is a
completelly new request and should not confused with bumped CONNECT
request.
Currently this check is done by ConnStateData class. This patch
removes the related ConnStateData code and allow DoCallouts run
the ssl_bump access checks for tls bumping step step2
... at tls bumping step1.
Documentation says:
   i. Get TCP-level info from the client.
      * In forward proxy environments, also parse the CONNECT request.
      * In interception environments, create a fake CONNECT request
        using TCP-level info.
  ii. Go through the Callout Sequence with the CONNECT request
      mentioned above.
 iii. Evaluate all ssl_bump rules and perform the first matching
      action (splice, bump, peek, stare, or terminate).

But for intercepted environments the order was:
   - evaluate ssl_bump rules
   - create fake CONNECT request
   - Go through callout sequency
Squid documentation says about step2:
 - Get TLS Client Hello info from the client, including SNI
    where available. Adjust the CONNECT request from step1
    to reflect SNI.

 - Go through the Callout Sequence with the adjusted CONNECT
   request mentioned above.

 - Evaluate again all ssl_bump rules and perform the first matching
   action (splice, bump, peek, stare, or terminate).
    * Peeking at this step usually makes bumping at step 3 impossible.
    * Staring at this step usually makes splicing at step 3 impossible.

However current implementation:
  - Does not adjust CONNECT request from step1 retrieved info (SNI).
  - The fix/commit "TLS bumping fix: Perform a second doCallouts when
    step2 runs the Callouts for step2 but this is not done with a
    safe way: the calouts called twice for each ClientHttpRequest
    object. This is not the way the ClientHttpRequest is designed.

This patch at step2 builds a new ClientHttpRequest object and run
the callouts for this object. It uses the same ClientHttpRequest
object to splice if required.
and initiateTunneledRequest ConnStateData methods

It is needed only in one place. For this unique case set the
ConnStateData::inBuf member directly.
…uest

The commit "Fix ssl bumping step2 to follow ..." builds a new request
with the adjusted SNI info, but all the other properties (headers, auth
info, notes etc) from step1 request are lost.
This patch instead of building a new HttpRequest object and try set its
parameters, clone the old one and replace the host name in URI and host
header, if the SNI info exist and if the CONNECT string has a IP address
and not a DNS name.
This is must be enough.

Changes:
  - Add a new ConnStateData::buildFakeRequest variant which takes as argument
    an HttpRequest object and builds a new ClientHttpRequest object.
  - Now the ConnStateData::startPeekAndSpliceStep2 instead of calling
    the fakeAConnectRequest or initiateTunneledRequest methods
    clones the original HttpRequest object, adjust it and use
    the new ConnStateData::buildFakeRequest method
Current code while the certificates are generated checks if peek and
splice is enabled at step1 to decide if should generate SSL/CTX objects
or configure already generated SSL/CTX objects (inside
ConnStateData::sslCrtdHandleReply and ConnStateData::getSslContextStart
methods). This is wrong because we may have decide to peek/stare
at step1 but the procedure break before we generate the SSLL/CTX
objects. This branch tried to fix this wrongly by checking if
we are enter the Step3 too.
This patch just checks if we have already generate SSL/CTX objects
and if yes reconfigure them with the generated certificates else
generate new SSL/CTX objects based on generated certificates.

Also this patch renames (and slightly fixes) the ConnStateData
doPeekAndSpliceStep() method to restartTlsNegotiation().
Actually this is what it does.
Currently client blocks waiting for ever squid response and
squid does not process the readed HttpRequest to serve the error
page.
Copy link

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few small change requests, but the primary purpose of this review is to ask a few questions about the proposed changes.

Comment on lines 2746 to 2750
// Disable caching if the SSL object exists and has attached a CTX object
if (!(fd_table[clientConnection->fd].ssl)) {
sslBumpCertKey.clear();
Ssl::InRamCertificateDbKey(certProperties, sslBumpCertKey);
assert(!sslBumpCertKey.isEmpty());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Disable caching if the SSL object exists and has attached a CTX object

GitHub does not let me quote more code below these lines, but I see nothing about disabling HTTP(??) caching in those lines. The literal description of the two combined if statements guards feels unhelpful at best. A developer can read the code and see that the description matches the code, but still have no clue what these conditions actually mean (i.e. when do we want to disable caching and why only under those conditions).

Please clarify:

  1. What kind of caching are we disabling here? HTTP, TLS sessions, other? Which call inside the if statement will disable that caching? Is the primary purpose of that call to disable caching?

  2. What do we think the if statement conditions mean? The old comment was implying it means we are bumping the client connection (or something like that), but you have removed that so I assume it was wrong.

The old comment was not clear either so I welcome the opportunity to clarify what is going on here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. What kind of caching are we disabling here? HTTP, TLS sessions, other? Which call inside the if statement will disable that caching? Is the primary purpose of that call to disable caching?

Instead of using a generated certificates cache, we are using a SSL_CTX objects cache.

In client-first and server-first bumping mode, when we are entering this code we have not generate yet the OpenSSL required structures to initiate TLS conversation. The required objects are the SSL_CTX (which is also holds server certificate) and then the SSL object (which is built based on SSL_CTX object). In this case we are getting from cache a SSL_CTX object and we are building the SSL object based on this SSL_CTX object

In Peek/Stare modes we are creating SSL_CTX/SSL objects in very early steps and although there is the SSL_set_SSL_CTX method which allow you to set an SSL_CTX object to an SSL object we have consider this procedure unsafe so we decided to regenerate certificates (or get it from helpers cache) and attach the generated certificates to the connection existing (and empty/unconfigured) SSL_CTX object.

(I think it worths to recheck if we can use a cached SSL_CTX object from cache for stare modes.)

  1. What do we think the if statement conditions mean? The old comment was implying it means we are bumping the client connection (or something like that), but you have removed that so I assume it was wrong.

I changed the old comment in my commit 521148e. This is because I realized that checks for step1 decisions are not safe and it is better to just check if the SSL/SSL_CTX objects are configured.

I think the correct comment should be something like the following:
// Do not use SSL_CTX objects cache if the SSL/SSL_CTX objects pair for this connection are built

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please recheck my commit 28c3d4d for the updated comment.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please recheck my commit 28c3d4d for the updated comment.

I appreciate your earlier explanation. I may be able to polish this code/comments (with your help) further, but I cannot find the 28c3d4d commit you mentioned above. Perhaps you have not pushed it to GitHub?

src/client_side.cc Outdated Show resolved Hide resolved
Comment on lines 1522 to 1533
auto srvBump = getConn()->serverBump();
// XXX: The following "if" works however the bumping step is not updated
// correctly by peek-and-splice code.
if (srvBump && srvBump->at(XactionStep::tlsBump2)) {
// Update request object
srvBump->request = request;
srvBump->act.step2 = sslBumpNeed_;
CallJobHere(85, 4, getConn(), ConnStateData, resumePeekAndSpliceStep2);
return;
}

if (untouchedConnect && sslBumpNeeded()) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can swap the first if with the second if, please swap them. It seems odd to check for the second step before checking whether the first step is needed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not do this change. If it is important I will try to change it however.
IF I change it the code will look something like the following:

if ((!srvBump  && untouchedConnect && sslBumpNeeded() )|| (srvBump && srvBump->at(XactionStep::tlsBump1))) {
...
} else if (srvBump && srvBump->at(XactionStep::tlsBump2) {
...
}

If we are at bumping step1, normally we have not create yet the srvBump object (this is done later by client_side.cc code ), or we have create one just to serve an error (eg an HTTP deny).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IF I change it, the code will look [bad because] If we are at bumping step1, normally we have not create yet the srvBump object (this is done later by client_side.cc code ), or we have create one just to serve an error (eg an HTTP deny).

if ((!srvBump  && untouchedConnect && sslBumpNeeded() ) ||
    (srvBump && srvBump->at(XactionStep::tlsBump1))) {
    ...
    sslBumpStart();
    return;
}

if (srvBump && srvBump->at(XactionStep::tlsBump2)) {

Would it be OK to create the srvBump object as soon as we know that we may need to evaluate ssl_bump? It feels like there are many code places where you have to keep reminding me that srvBump may not exist in some special cases and, in those special cases, we need to checks something else to understand what is going on. Can we create srvBump in all those special cases instead?

After creating srvBump object ASAP, we can probably make sslBumpNeeded() return false when srvBump has been created already. AFAICT, sslBumpNeeded() was meant to determine future need for SslBump. If we already created srvBump, then srvBump should be consulted to determine the current need.

At the end, I would not be surprised if sslBumpNeed_ would be completely gone because we would just create srvBump instead of (remembering that we should create it and using sslBumpNeed_ to remember that future need). Creation of srvBump will signify the beginning of step1.

The code would then look like this:

if (untouchedConnect && srvBump && srvBump->at(XactionStep::tlsBump1)) {
    ...
    sslBumpStart();
    return;
}

if (srvBump && srvBump->at(XactionStep::tlsBump2)) {

The untouchedConnect part still looks a bit odd, but we can deal with that later, I guess.

N.B. We can rename "server" in serverBump()/etc. if needed (so that it is not specific to "server" bumping), but that is a minor naming discrepancy thing that we can do later/separately. I am a lot more worried about all the if statements that are very difficult to interpret correctly when nil serverBump() may mean something other than "we are not doing SslBump in any shape or form".

Comment on lines 1523 to 1525
// XXX: The following "if" works however the bumping step is not updated
// correctly by peek-and-splice code.
if (srvBump && srvBump->at(XactionStep::tlsBump2)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a GitHub comment, please detail or paraphrase what that XXX is trying to warn us about, please. I am trying to understand whether the step is wrong when we get here or should be updated inside the if statement but is not. Also, please discuss why are we not fixing this XXX in this PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit 214ffc8 removes this comment.

Also, please discuss why are we not fixing this XXX in this PR.

I consider it as out of scope.
This fix also forced some cleanup in bumping procedure, the commits ab757cf and ab757cf

// set final error but delay sending until we bump
Ssl::ServerBump *srvBump = new Ssl::ServerBump(this, e, Ssl::bumpClientFirst);
if (Ssl::ServerBump *srvBump = getConn()->serverBump()) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this if statement can be moved outside and before the sslBumpNeeded() check, please do so. It feels a bit odd and more fragile to first check for the need to bump (i.e. for something that still needs to be done) and only then to check whether we have bumped already (i.e. something that has happened in the past).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit 2c9e47c

Maybe not exactly what you want, but I hope it looks better.

storeUnregister(sc, entry, this);
entry->unlock("Ssl::ServerBump");
}
entry = e;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, please add:

Suggested change
entry = e;
assert(e);
entry = e;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit 35abb86

Comment on lines 84 to 85
entry->unlock("Ssl::ServerBump");
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this will not affect the way the code works today, but this code may disengage from the entry-setting code below it. We should not leave dangling pointers, especially when the same member or variable is reused later.

Suggested change
entry->unlock("Ssl::ServerBump");
}
entry->unlock("Ssl::ServerBump#storeEntryError");
entry = nullptr;
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The related code is changed with commit 35abb86

if (entry) {
assert(sc);
storeUnregister(sc, entry, this);
entry->unlock("Ssl::ServerBump");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
entry->unlock("Ssl::ServerBump");
entry->unlock("Ssl::ServerBump#storeEntryError");

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit 82d319e

entry->unlock("Ssl::ServerBump");
}
entry = e;
entry->lock("Ssl::ServerBump");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should mark each lock (and each unlock) location in a unique way to avoid guesswork during triage. Squid code already uses the URL-like # symbol to separate the primary code context from its lower-level location detail IIRC.

Suggested change
entry->lock("Ssl::ServerBump");
entry->lock("Ssl::ServerBump#storeEntryError");

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit 82d319e

return false;
}
if (srvBump && srvBump->at(XactionStep::tlsBump2)) {
assert(http->request->method == Http::METHOD_CONNECT);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please discuss what this assert() means, and why we must check this condition during step2.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we "must" not check.
Looks that I add it to debug bad states, eg where the bumping step is not updated correctly, eg when the ConnStateData->sslServerBump->step remains at step2 after bump completed and the first tunneled request enters this code.

The ClientRequestContext::sslBumpAccessCheck is called for all HTTP requests enters the doCallouts. But if the current ConnStateData object marks bumping step2 means that this is a doCallouts for a CONNECT request at bumping step2.

I did not remove it (yet).

 - Currently when client first bumping mode selected at step1
   we are not creating an Ssl::ServerBump object for ConnStateData
   object. But we are creating when we have to serve an error page
   using the client-first bumping mode.
   This patch fixes the ConnStateData::switchToHttps to always builds
   a Ssl::ServerBump mode for all bumping modes including client-first
   mode.

 - Inside ConnStateData::switchToHttps fix the check for
   ConnStateData::sslServerBump. If this member is already set here
   means that we are serving an error using client-first
   bumping mode.

 - Inside Ssl::PeekingPeerConnector replace any check for
   ConnStateData::sslServerBump existance with must clauses. This is
   must be always set when the Ssl::PeekingPeerConnector is called.
Add this check inside Ssl::PeekingPeerConnector::initialize and remove
any check for request.flags.sslPeek from any other place inside
Ssl::PeekingPeerConnector.

Recheck: Should the check be moved inside Ssl::PeekingPeerConnector
constructor?
- Add the new tlsBumpDone XactionStep to mark as ConnStateData bumping step
after the TLS bump is completed because of connection terminate, splice
or client-side and server-side TLS connections estalished.
This is may allow use of the sslBump steps with at_step acl in other
than ssl_bump access lists.

- If the bump-server-first is selected at step tlsBump1 then update the
ConnStateData bumping step to tlsBump3.

- Assure that the PeekingPeerConnector is called always after we entered
bumping step tlsBump3
Bumping at step1 (using "bump" as action) must result to client-first
bumping step.
... inside ClientHttpRequest::doCallouts
The Ssl::ServerBump::entry member always is not nil
The SslServerBump:sc member always is not nil
... inside ConnStateData::getSslContextStart
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants