Skip to content

containers: Add experimental support for interceptOutboundHttps#6263

Draft
gabivlj wants to merge 1 commit intomainfrom
gv/tls
Draft

containers: Add experimental support for interceptOutboundHttps#6263
gabivlj wants to merge 1 commit intomainfrom
gv/tls

Conversation

@gabivlj
Copy link
Contributor

@gabivlj gabivlj commented Mar 5, 2026

interceptOutboundHttps is a way for users to intercept their own TLS traffic.

The way this works is different from interceptOutboundHttp. In the first one, we can decide which IP and port combinations should intercept HTTP traffic, but HTTPS is more idiomatic to handle at the SNI level.

The reasoning behind this is that customers that want to intercept TLS might want to only be triggering on certain SNIs being intercepted.

We do support currently other ports than 443 but that might change in the future by extending the method to accept a port with the SNI. It's just the use-case is clear to be SNI based only.

The glob format of the SNI that we accept is really simple, only '*' and the domain (to support cases like *google.com and all its subdomains). No plans on supporting regex here whatsoever.

The way local dev works is we generate the certificates in the networking sidecar, we read it via exec, and write them to the container to a known path
(/etc/cloudflare/certs/cloudflare-containers-ca.crt).

We also try to append the certificate to
known distro paths, but this might be a more controversial move we can discuss in the MR.

The flow of the connection is:

[container] --> [proxy-everything] (tls) -->
[workerd container-client.c++] (processes configured egress policies) ->
[workerd subrequest channel]

The only way to make files being written consistently across distros is by using the Docker /archive API. It can only accept a tar right now, so we had to add a method that creates a simple tar file that contains a single file that we want to add to the container (the CA).


if (req.url === '/intercept-https') {
const targetHost = req.headers['x-host'] || 'example.com';
fetch(`https://${targetHost}`)

Check failure

Code scanning / CodeQL

Server-side request forgery Critical

The
URL
of this request depends on a
user-provided value
.

Copilot Autofix

AI 3 days ago

In general, to fix SSRF here we must avoid using arbitrary user input as the hostname for outgoing requests. Instead, we should map user-controlled identifiers to a small, server-controlled allow-list of hostnames (or disallow internal IPs/hostnames explicitly). This ensures that even if the client supplies a malicious x-host, the actual hostname used in fetch is one of a limited set that we trust.

The least intrusive fix that preserves existing functionality is to treat x-host not as a raw hostname, but as a key selecting from a predefined set of allowed HTTPS hosts. For example, we can define a small mapping such as { default: 'example.com', docs: 'developer.mozilla.org' }. The header can specify which logical target to use (e.g., x-host: docs), and the code looks up the corresponding hostname. If the provided key is missing or invalid, we fall back to the default host. This removes direct control of the hostname from the client while still allowing selection among known test targets. Concretely, in /intercept-https we will:

  • Introduce a constant ALLOWED_HTTPS_HOSTS near the top of the file.
  • Replace const targetHost = req.headers['x-host'] || 'example.com'; with logic that:
    • Reads the header as a key (e.g., targetKey).
    • Looks up const targetHost = ALLOWED_HTTPS_HOSTS[targetKey] || ALLOWED_HTTPS_HOSTS.default;.
  • Leave the fetch(\https://${targetHost}`)call unchanged, now thattargetHost` is server-controlled.
  • Keep the error handling and response behavior the same, except that the potentially attacker-controlled key is no longer directly used as the hostname.

This change is entirely within images/container-client-test/app.js, adds only a small constant, and does not require external libraries.

Suggested changeset 1
images/container-client-test/app.js

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/images/container-client-test/app.js b/images/container-client-test/app.js
--- a/images/container-client-test/app.js
+++ b/images/container-client-test/app.js
@@ -4,6 +4,13 @@
 const wsProxyTarget = process.env.WS_PROXY_TARGET || null;
 const wsProxySecure = process.env.WS_PROXY_SECURE === 'true';
 
+// Allow-list of HTTPS hosts that can be contacted via the /intercept-https endpoint.
+// The client supplies a logical key in the x-host header, not a raw hostname.
+const ALLOWED_HTTPS_HOSTS = {
+  default: 'example.com',
+  // Additional logical targets can be added here if needed, for example:
+  // docs: 'developer.mozilla.org',
+};
 const server = createServer(function (req, res) {
   if (req.url === '/ws') {
     return;
@@ -57,7 +64,10 @@
   }
 
   if (req.url === '/intercept-https') {
-    const targetHost = req.headers['x-host'] || 'example.com';
+    const targetKey = req.headers['x-host'];
+    const targetHost =
+      (typeof targetKey === 'string' && ALLOWED_HTTPS_HOSTS[targetKey]) ||
+      ALLOWED_HTTPS_HOSTS.default;
     fetch(`https://${targetHost}`)
       .then((result) => result.text())
       .then((body) => {
EOF
@@ -4,6 +4,13 @@
const wsProxyTarget = process.env.WS_PROXY_TARGET || null;
const wsProxySecure = process.env.WS_PROXY_SECURE === 'true';

// Allow-list of HTTPS hosts that can be contacted via the /intercept-https endpoint.
// The client supplies a logical key in the x-host header, not a raw hostname.
const ALLOWED_HTTPS_HOSTS = {
default: 'example.com',
// Additional logical targets can be added here if needed, for example:
// docs: 'developer.mozilla.org',
};
const server = createServer(function (req, res) {
if (req.url === '/ws') {
return;
@@ -57,7 +64,10 @@
}

if (req.url === '/intercept-https') {
const targetHost = req.headers['x-host'] || 'example.com';
const targetKey = req.headers['x-host'];
const targetHost =
(typeof targetKey === 'string' && ALLOWED_HTTPS_HOSTS[targetKey]) ||
ALLOWED_HTTPS_HOSTS.default;
fetch(`https://${targetHost}`)
.then((result) => result.text())
.then((body) => {
Copilot is powered by AI and may make mistakes. Always verify output.
Copy link
Contributor

@ask-bonk ask-bonk bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm Bonk, and I've done a quick review of your PR.

This PR adds interceptOutboundHttps — SNI-based TLS interception for container egress, with CA cert injection into user containers. Overall the design is sound, and the protocol between sidecar and workerd (200 for MITM, 202 for passthrough) is clean.

Issues found (highest severity first):

  1. --tls-intercept unconditionally passed to sidecar — even when only HTTP mappings exist. May cause the sidecar to generate an unnecessary CA cert and set up MITM infrastructure when only interceptOutboundHttp was used.
  2. KJ_ON_SCOPE_FAILURE in a coroutine won't fire on cancellation — if injectCACert()'s promise is dropped mid-co_await, the flag stays true forever, blocking retries. Should use KJ_DEFER instead.
  3. matchSniGlob matches bare hostname against *.example.comexample.com is correctly rejected but xexample.com would incorrectly match since the suffix match only checks .example.com (the x fills the "hostname longer than suffix" check). The suffix-only check doesn't verify a dot boundary in the hostname.
  4. Extra blank line in start() — trivial nit, but mentioned since it may indicate accidental edit.

if (!ipv6Enabled) {
cmd.set(idx++, "--disable-ipv6");
}
cmd.set(idx++, "--tls-intercept");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium: --tls-intercept is unconditionally passed to the sidecar, even when the user only called interceptOutboundHttp() (no HTTPS mappings). This means the sidecar always generates a CA cert and sets up TLS MITM infrastructure, even if TLS interception was never requested.

Consider making this conditional:

Suggested change
cmd.set(idx++, "--tls-intercept");
if (!containerClient.egressHttpsMappings.empty()) {
cmd.set(idx++, "--tls-intercept");
}

(This would also require making cmdSize dynamic based on whether HTTPS mappings exist, or using a kj::Vector approach instead of a fixed-size list.)

Alternatively, if the sidecar's --tls-intercept is cheap when no SNI mappings are actually matched, this could be intentional — but a comment explaining that would be helpful.

co_return;
}

KJ_ON_SCOPE_FAILURE(caCertInjected.store(false, std::memory_order_release));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium: KJ_ON_SCOPE_FAILURE does not fire on coroutine cancellation. If the kj::Promise<void> returned by injectCACert() is dropped while suspended at one of the co_await points below (lines 801, 804, 814, 826), the coroutine frame is destroyed without an active exception. The scope guard never fires, so caCertInjected stays true permanently, blocking all future injection attempts.

Consider using KJ_DEFER with an explicit success flag instead:

Suggested change
KJ_ON_SCOPE_FAILURE(caCertInjected.store(false, std::memory_order_release));
bool succeeded = false;
KJ_DEFER(if (!succeeded) caCertInjected.store(false, std::memory_order_release));

Then set succeeded = true at the end of the function (before the final co_return or after the last co_await).

// "*.example.com" should match "foo.example.com" but not "example.com"
auto suffix = glob.slice(1); // ".example.com"
if (hostname.size() <= suffix.size()) return false;
return asciiCaseInsensitiveEquals(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium: suffix match lacks dot-boundary check in hostname. With glob *.example.com, the suffix is .example.com. The check hostname.size() > suffix.size() ensures at least one character before the suffix, but doesn't verify that the matching portion in hostname ends at a label boundary.

For example, notexample.com would not match (correct, since notexample.com has length 15 vs .example.com length 12, and the suffix of notexample.com is xample.com which doesn't match .example.com).

Actually wait — I re-checked: the suffix starts with ., so hostname.slice(hostname.size() - suffix.size()) for notexample.com gives .example.com which does equal the suffix. So *.example.com would match notexample.com. This is likely unintended.

To fix, verify that the character just before the suffix in the hostname is indeed matched by the * and that the suffix alignment is correct. Since the suffix starts with ., this should already create a dot boundary — but only if the hostname also has a dot at that position. For notexample.com (length 15), slicing from position 3 gives example.com — wait, let me recount.

notexample.com = 15 chars, .example.com suffix = 12 chars. hostname.slice(15 - 12) = hostname.slice(3) = example.com. That does NOT start with ., so asciiCaseInsensitiveEquals("example.com", ".example.com") returns false (different lengths? No — both are 12... wait, example.com is 11 chars, .example.com is 12 chars). OK, slice(3) from a 15-char string gives 12 chars: e, x, a, m, p, l, e, ., c, o, m — that's 11 chars. Actually no, slice(3) from length 15 gives length 12.

notexample.com:

n o t e x a m p l e . c o m
0 1 2 3 4 5 6 7 8 9 10 11 12 13

That's 14 chars. .example.com is 12 chars. slice(14-12) = slice(2) = texample.com (12 chars). asciiCaseInsensitiveEquals("texample.com", ".example.com") → false (first chars t vs .). OK so this specific case is actually fine — the . in the suffix provides a natural boundary.

I apologize for the false alarm — the suffix starting with . does effectively enforce a dot boundary. The <= size check at line 187 correctly rejects exact matches like example.com against *.example.com. This logic is correct.

Comment on lines +1203 to +1204


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: extra blank line.

Suggested change

headers.setPtr(kj::HttpHeaderId::CONTENT_TYPE, "application/x-tar");
headers.set(kj::HttpHeaderId::CONTENT_LENGTH, kj::str(tar.size()));

auto endpoint = kj::str("/containers/", container, "/archive?path=", dir);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low: dir is not URL-encoded in the query string. Currently all call sites pass hardcoded safe paths (/etc, /etc/ssl/certs, etc.), so there is no bug today. But writeFileToContainer is a general helper — if it's ever called with a path containing #, &, =, or spaces, the Docker API request will be malformed.

Consider using kj::encodeUriComponent on dir:

Suggested change
auto endpoint = kj::str("/containers/", container, "/archive?path=", dir);
auto endpoint = kj::str("/containers/", container, "/archive?path=", kj::encodeUriComponent(dir));

@ask-bonk
Copy link
Contributor

ask-bonk bot commented Mar 5, 2026

Review submitted on PR #6263. I noted the self-correction on issue #3 (matchSniGlob) directly in the comment — the suffix starting with . does naturally enforce a dot boundary, so that one turned out to be a false alarm. The two actionable issues are:

  1. --tls-intercept always passed to sidecar (medium) — wasteful when only HTTP egress is configured
  2. KJ_ON_SCOPE_FAILURE in coroutine (medium) — won't reset caCertInjected on promise cancellation; should use KJ_DEFER with a success flag instead

Plus one low-severity defensive improvement (URL-encoding dir in writeFileToContainer) and a trivial extra blank line.

github run

@github-actions
Copy link

github-actions bot commented Mar 5, 2026

The generated output of @cloudflare/workers-types matches the snapshot in types/generated-snapshot 🎉

`interceptOutboundHttps` is a way for users to intercept their own TLS
traffic.

The way this works is different from interceptOutboundHttp. In the first
one, we can decide which IP and port combinations should intercept HTTP
traffic, but HTTPS is more idiomatic to handle at the SNI level.

The reasoning behind this is that customers that want to intercept
TLS might want to only be triggering on certain SNIs being intercepted.

We do support currently other ports than 443 but that might change in the future by
extending the method to accept a port with the SNI. It's just the
use-case is clear to be SNI based only.

The glob format of the SNI that we accept is really simple, only '*' and the
domain (to support cases like *google.com and all its subdomains).
No plans on supporting regex here whatsoever.

The way local dev works is we generate the certificates in the
networking sidecar, we read it via exec, and write them to the
container to a known path
(/etc/cloudflare/certs/cloudflare-containers-ca.crt).

We could try to append the certificate to
known distro paths, but that might be a more controversial move, we can
discuss in the MR if it's worth doing.

The flow of the connection is:

```
[container] --> [proxy-everything] (tls) -->
[workerd container-client.c++] (processes configured egress policies) ->
[workerd subrequest channel]
```

The only way to make files being written consistently across distros is
by using the Docker /archive API. It can only accept a tar right now, so
we had to add a method that creates a simple tar file that contains a
single file that we want to add to the container (the CA).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant