Bug 5501: Squid may exit when ACLs decode an invalid URI (#2145) #2369
Open
squidadm wants to merge 1 commit into squid-cache:v7 from
Conversation
…#2145)

    2025/08/14 09:28:51| FATAL: invalid pct-encoded triplet
    exception location: Uri.cc(102) Decode

The bug affects url_regex and urllogin ACLs. However, not every use of those ACLs results in a FATAL exit. The exact preconditions are unknown.

Three pct-encoding (RFC 3986) error handling algorithms were considered:

Algorithm A: An ACL that cannot decode, mismatches

This algorithm is similar to the algorithm used for handling "ACL is used in context without ALE" and similar errors, but there is a significant context difference: Those "without ALE" errors are Squid misconfigurations or bugs! Decoding failures, on the other hand, are caused by request properties outside of admin or Squid control.

With this algorithm, a request can easily avoid a "deny urlHasX" rule match by injecting an invalid pct-encoding (e.g., `X%bad`). Such injections may not be practical for URLs of most resources outside of client control because most servers are unlikely to recognize the malformed URL as something useful for the client. As for resources that the client does control, a urlHasX ACL cannot be effective for those anyway because the client can change URLs.

Algorithm A does not let Squid admins match problematic URLs!

Algorithm B: An ACL that cannot decode X, tests raw/encoded X

With this algorithm, a request can trigger some "allow urlHasY" rule matches by injecting an invalid pct-encoding that looks like Y (e.g., if an "allow" rule looks for the word `good`, a request may contain a `%good` or `%XXgood` sequence). Just like with algorithm A, such injections probably have little practical value, for similar reasons.

Algorithm B lets Squid admins match problematic URLs.

Algorithm C: An ACL that cannot decode X, tests partially decoded X

With this algorithm, a "partially decoded X" is X where invalid pct-encoding sequences (or their parts) are left "as is" while valid pct-encoding triplets are decoded. This is actually a family of similar algorithms because there are multiple ways to define invalid pct-encoding sequence boundaries in certain URLs! For example, `%6Fne%f%6Fo` can be replaced with `one%foo` or `one%f%6Fo`. This additional complexity/uncertainty aggravates the two concerns below.

Algorithm B notes apply to algorithm C as well.

Algorithm C lets admins match problematic URLs but, again, it requires that admins know exactly how Squid is going to isolate problematic pct-encoding triplets (e.g., skip/leave just the `%` byte that starts an invalid pct-encoding sequence or the following two bytes as well).

Algorithm C family includes rfc1738_unescape() behavior. That decoding function was used for the two ACLs before commit cbb9bf1 and commit 226394f started to use AnyP::Uri::Decode() added in commit 26256f2. For example, rfc1738_unescape() decodes `%%` as `%` and leaves some other invalid pct-encoding one-, two-, and three-byte sequences in the decoded result. It is unlikely that many admins know exactly what that old decoding does, but they could tune their rules to "work" as they expect for specific cases. Those rules could stop working after the above commits (v7.0.1+) and this change, to their surprise.

This change implements Algorithm B:

* Unlike Algorithm A, B allows admins to match bad URLs.
* Unlike Algorithm C, B does not force admins to guess how Squid mangles a bad URL before matching it.

Also updated ACLs documentation to reflect current implementation.
Contributor
Section titles in the original commit message were lost while backporting. Please restore them.
Contributor
No, the Markdown is dropped by Anubis' original patch (https://github.com/squidadm/squid/commit/0f39e8caed5e867b9771a9c21e112764f4dbe92f.patch).
I have now manually copied the PR description from the original PR to this one. Please be aware, though, that this will most likely cause the titles to be erased on merge, since lines with "#" as the first non-whitespace character are treated as comments and ignored by git.
The bug affects url_regex and urllogin ACLs. However, not every use of
those ACLs results in a FATAL exit. The exact preconditions are unknown.
Three pct-encoding (RFC 3986) error handling algorithms were considered:
Algorithm A: An ACL that cannot decode, mismatches
This algorithm is similar to the algorithm used for handling "ACL is
used in context without ALE" and similar errors, but there is a
significant context difference: Those "without ALE" errors are Squid
misconfigurations or bugs! Decoding failures, on the other hand, are
caused by request properties outside of admin or Squid control.
With this algorithm, a request can easily avoid a "deny urlHasX" rule
match by injecting an invalid pct-encoding (e.g.,
`X%bad`). Such injections may not be practical for URLs of most resources outside of
client control because most servers are unlikely to recognize the
malformed URL as something useful for the client. As for resources that
the client does control, a urlHasX ACL cannot be effective for those anyway
because the client can change URLs.
Algorithm A does not let Squid admins match problematic URLs!
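For illustration only, here is a hypothetical squid.conf fragment (the ACL name, pattern, and URL are assumptions, not part of this change) showing the Algorithm A concern: if an undecodable URL simply mismatches, a client can slip past a deny rule by making the URL undecodable.

```
# Hypothetical admin rule: block URLs mentioning "secret".
acl urlHasSecret url_regex -i secret
http_access deny urlHasSecret

# Under Algorithm A, a request for http://example.com/secret%zz fails to
# decode, the ACL mismatches, and the deny rule above never applies.
```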
Algorithm B: An ACL that cannot decode X, tests raw/encoded X
With this algorithm, a request can trigger some "allow urlHasY" rule
matches by injecting an invalid pct-encoding that looks like Y (e.g., if
an "allow" rule looks for the word
`good`, a request may contain a `%good` or `%XXgood` sequence). Just like with algorithm A, such injections probably have little practical value, for similar reasons.
Algorithm B lets Squid admins match problematic URLs.
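Conversely, a hypothetical fragment (assumed names and URL, not from this change) illustrating both the Algorithm B injection concern and the capability it gives admins: because an undecodable URL is tested in its raw/encoded form, its bytes can still match.

```
# Hypothetical admin rule: allow URLs mentioning "good".
acl urlHasGood url_regex -i good
http_access allow urlHasGood

# With Algorithm B, a request for http://example.com/%good cannot be
# decoded, so the raw URL is tested and "good" still matches: the rule
# can be triggered by injection, but a "deny" variant can likewise still
# catch a bad word hidden inside an invalid pct-encoding.
```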
Algorithm C: An ACL that cannot decode X, tests partially decoded X
With this algorithm, a "partially decoded X" is X where invalid
pct-encoding sequences (or their parts) are left "as is" while valid
pct-encoding triplets are decoded. This is actually a family of similar
algorithms because there are multiple ways to define invalid
pct-encoding sequence boundaries in certain URLs! For example,
`%6Fne%f%6Fo` can be replaced with `one%foo` or `one%f%6Fo`. This additional complexity/uncertainty aggravates the two concerns below.
Algorithm B notes apply to algorithm C as well.
Algorithm C lets admins match problematic URLs but, again, it requires
that admins know exactly how Squid is going to isolate problematic
pct-encoding triplets (e.g., skip/leave just
the `%` byte that starts an invalid pct-encoding sequence or the following two bytes as well).
Algorithm C family includes rfc1738_unescape() behavior. That decoding
function was used for the two ACLs before commit cbb9bf1 and commit
226394f started to use AnyP::Uri::Decode() added in commit 26256f2.
For example, rfc1738_unescape() decodes
`%%` as `%` and leaves some other invalid pct-encoding one-, two-, and three-byte sequences in the
decoded result. It is unlikely that many admins know exactly what that
old decoding does, but they could tune their rules to "work" as they
expect for specific cases. Those rules could stop working after the
above commits (v7.0.1+) and this change, to their surprise.
This change implements Algorithm B:

* Unlike Algorithm A, B allows admins to match bad URLs.
* Unlike Algorithm C, B does not force admins to guess how Squid
  mangles a bad URL before matching it.
Also updated ACLs documentation to reflect current implementation.
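For readers unfamiliar with the decode-or-raw idea, here is a minimal standalone C++ sketch of the Algorithm B behavior described above. It is not Squid source: strictPctDecode() and urlRegexMatches() are invented names, and the strict decoder is only a simplified stand-in for AnyP::Uri::Decode().

```cpp
// Minimal sketch of Algorithm B: decode when possible, otherwise test the
// raw/encoded input. The decoder rejects any invalid pct-encoded triplet.
#include <cctype>
#include <iostream>
#include <optional>
#include <regex>
#include <string>

// Returns std::nullopt when the input contains an invalid pct-encoded triplet.
static std::optional<std::string> strictPctDecode(const std::string &in)
{
    std::string out;
    for (size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '%') {
            out += in[i];
            continue;
        }
        const auto isHex = [&](size_t p) {
            return p < in.size() && std::isxdigit(static_cast<unsigned char>(in[p]));
        };
        if (!isHex(i + 1) || !isHex(i + 2))
            return std::nullopt; // invalid triplet: decoding fails
        out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
        i += 2;
    }
    return out;
}

// Algorithm B: an ACL that cannot decode X tests raw/encoded X instead.
static bool urlRegexMatches(const std::regex &re, const std::string &rawUrl)
{
    const auto decoded = strictPctDecode(rawUrl);
    return std::regex_search(decoded ? *decoded : rawUrl, re);
}

int main()
{
    const std::regex secret("secret", std::regex::icase);
    std::cout << urlRegexMatches(secret, "/%73ecret") << "\n"; // 1: decoded form matches
    std::cout << urlRegexMatches(secret, "/secret%zz") << "\n"; // 1: raw form matches (Algorithm B)
    std::cout << urlRegexMatches(secret, "/harmless") << "\n";  // 0: no match either way
}
```

Under Algorithm A the second lookup would mismatch, and under an Algorithm C variant its outcome would depend on how the invalid `%zz` sequence is isolated.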