
Bug 5501: Squid may exit when ACLs decode an invalid URI (#2145)#2369

Open
squidadm wants to merge 1 commit into squid-cache:v7 from squidadm:v7-backport-pr2145

Conversation

@squidadm (Collaborator) commented Feb 6, 2026

2025/08/14 09:28:51| FATAL: invalid pct-encoded triplet
    exception location: Uri.cc(102) Decode

The bug affects url_regex and urllogin ACLs. However, not every use of
those ACLs results in a FATAL exit. The exact preconditions are unknown.
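For reference, the strict decoding behavior that produces this FATAL can be sketched as follows (a minimal Python illustration of the behavior described above, not Squid's actual AnyP::Uri::Decode() code): any malformed %XX triplet raises, and if the exception escapes the ACL code, the process exits.

```python
HEX = set("0123456789abcdefABCDEF")

def strict_pct_decode(s: str) -> str:
    # Decode %XX triplets; raise on any malformed triplet. This mirrors
    # the strict behavior described in the bug report, not Squid's code.
    out, i = [], 0
    while i < len(s):
        if s[i] == "%":
            pair = s[i + 1:i + 3]
            if len(pair) != 2 or not set(pair) <= HEX:
                raise ValueError("invalid pct-encoded triplet")
            out.append(chr(int(pair, 16)))
            i += 3
        else:
            out.append(s[i])
            i += 1
    return "".join(out)
```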

Three pct-encoding (RFC 3986) error handling algorithms were considered:

Algorithm A: An ACL that cannot decode, mismatches

This algorithm is similar to the algorithm used for handling "ACL is
used in context without ALE" and similar errors, but there is a
significant context difference: Those "without ALE" errors are Squid
misconfigurations or bugs! Decoding failures, on the other hand, are
caused by request properties outside of admin or Squid control.

With this algorithm, a request can easily avoid a "deny urlHasX" rule
match by injecting an invalid pct-encoding (e.g., `X%bad`). Such
injections may not be practical for URLs of most resources outside of
client control because most servers are unlikely to recognize the
malformed URL as something useful for the client. As for resources that
the client does control, a urlHasX ACL cannot be effective for those
anyway because the client can change URLs.

Algorithm A does not let Squid admins match problematic URLs!

Algorithm B: An ACL that cannot decode X, tests raw/encoded X

With this algorithm, a request can trigger some "allow urlHasY" rule
matches by injecting an invalid pct-encoding that looks like Y (e.g., if
an "allow" rule looks for the word `good`, a request may contain a
`%good` or `%XXgood` sequence). Just like with algorithm A, such
injections probably have little practical value, for similar reasons.

Algorithm B lets Squid admins match problematic URLs.
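Algorithm B can be sketched in a few lines (Python, with invented helper names; Squid's real ACL code differs): attempt a strict decode, and on failure hand the ACL the raw, still-encoded URL, so the problematic bytes remain matchable.

```python
HEX = set("0123456789abcdefABCDEF")

def strict_pct_decode(s: str) -> str:
    # Strict decoder: any malformed %XX triplet raises, like the
    # decoding failure described in the bug report.
    out, i = [], 0
    while i < len(s):
        if s[i] == "%":
            pair = s[i + 1:i + 3]
            if len(pair) != 2 or not set(pair) <= HEX:
                raise ValueError("invalid pct-encoded triplet")
            out.append(chr(int(pair, 16)))
            i += 3
        else:
            out.append(s[i])
            i += 1
    return "".join(out)

def url_text_for_acl(url: str) -> str:
    # Algorithm B: if the URL cannot be decoded, test the raw
    # (still-encoded) form instead of mismatching or exiting.
    try:
        return strict_pct_decode(url)
    except ValueError:
        return url
```

With this sketch, a url_regex-style rule still sees the invalid sequence (e.g., `%zz`) verbatim and can be written to match it.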

Algorithm C: An ACL that cannot decode X, tests partially decoded X

With this algorithm, a "partially decoded X" is X where invalid
pct-encoding sequences (or their parts) are left "as is" while valid
pct-encoding triplets are decoded. This is actually a family of similar
algorithms because there are multiple ways to define invalid
pct-encoding sequence boundaries in certain URLs! For example,
`%6Fne%f%6Fo` can be replaced with `one%foo` or `one%f%6Fo`. This
additional complexity/uncertainty aggravates the two concerns below.

Algorithm B's notes apply to Algorithm C as well.

Algorithm C lets admins match problematic URLs but, again, it requires
that admins know exactly how Squid is going to isolate problematic
pct-encoding triplets (e.g., whether it skips/leaves just the `%` byte
that starts an invalid pct-encoding sequence or the following two bytes
as well).
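The boundary ambiguity can be made concrete with a toy partial decoder (Python; the `skip_whole_triplet` switch is invented for illustration) whose two settings reproduce the two decodings of the example above:

```python
HEX = set("0123456789abcdefABCDEF")

def partial_decode(s: str, skip_whole_triplet: bool) -> str:
    # Algorithm C sketch: decode valid %XX triplets, leave invalid
    # sequences as-is. The boundary of an invalid sequence is a choice:
    #   skip_whole_triplet=False: keep only the '%' byte and rescan
    #     from the next character
    #   skip_whole_triplet=True:  keep the '%' and the two bytes after it
    out, i = [], 0
    while i < len(s):
        if s[i] == "%":
            pair = s[i + 1:i + 3]
            if len(pair) == 2 and set(pair) <= HEX:
                out.append(chr(int(pair, 16)))
                i += 3
                continue
            if skip_whole_triplet:
                out.append(s[i:i + 3])
                i += 3
            else:
                out.append("%")
                i += 1
        else:
            out.append(s[i])
            i += 1
    return "".join(out)

# Both outputs for the example from the text are defensible:
assert partial_decode("%6Fne%f%6Fo", skip_whole_triplet=False) == "one%foo"
assert partial_decode("%6Fne%f%6Fo", skip_whole_triplet=True) == "one%f%6Fo"
```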

Algorithm C family includes rfc1738_unescape() behavior. That decoding
function was used for the two ACLs before commit cbb9bf1 and commit
226394f started to use AnyP::Uri::Decode() added in commit 26256f2.
For example, rfc1738_unescape() decodes `%%` as `%` and leaves some
other invalid pct-encoding one-, two-, and three-byte sequences in the
decoded result. It is unlikely that many admins know exactly what that
old decoding does, but they could tune their rules to "work" as they
expect for specific cases. Those rules could stop working after the
above commits (v7.0.1+) and this change, to their surprise.
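A rough sketch of that family-C behavior, based only on the properties stated above (`%%` decodes to `%`; other invalid sequences are left as-is; the real rfc1738_unescape() boundary handling may differ):

```python
HEX = set("0123456789abcdefABCDEF")

def lenient_unescape(s: str) -> str:
    # Sketch of rfc1738_unescape()-style decoding as described above:
    # valid %XX triplets decode, "%%" collapses to "%", and other
    # invalid sequences pass through to the output unchanged.
    out, i = [], 0
    while i < len(s):
        if s[i] == "%":
            if s[i + 1:i + 2] == "%":
                out.append("%")
                i += 2
                continue
            pair = s[i + 1:i + 3]
            if len(pair) == 2 and set(pair) <= HEX:
                out.append(chr(int(pair, 16)))
                i += 3
                continue
        out.append(s[i])
        i += 1
    return "".join(out)
```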

This change implements Algorithm B:

  • Unlike Algorithm A, B allows admins to match bad URLs.
  • Unlike Algorithm C, B does not force admins to guess how Squid
    mangles a bad URL before matching it.

Also updated ACLs documentation to reflect current implementation.

@rousskov (Contributor) commented Feb 6, 2026

Section titles in the original commit message were lost while backporting. Please restore them.

@yadij (Contributor) commented Feb 6, 2026

> Section titles in the original commit message were lost while backporting.

No, the Markdown is dropped by the original Anubis patch (https://github.com/squidadm/squid/commit/0f39e8caed5e867b9771a9c21e112764f4dbe92f.patch).

> Please restore them.

I have now manually copied the PR description from the original PR to this one.

Please be aware, though, that this will most likely cause the titles to be erased on merge, since lines whose first non-whitespace character is "#" are treated as comments and ignored by git.

