fix: bypass Squid for network gateway to fix MCP SSE crash #553

Merged
Mossaka merged 2 commits into main from fix/bypass-squid-for-network-gateway on Feb 6, 2026

Mossaka commented Feb 6, 2026

Summary

  • Fixes Squid crash (comm.cc:1583 assertion failure / segfault) when proxying concurrent MCP Streamable HTTP (SSE) connections from Codex to the MCP gateway
  • Adds the container's default network gateway (172.30.0.1) to the iptables bypass list alongside host.docker.internal (172.17.0.1)

Root Cause

Codex resolves host.docker.internal to 172.30.0.1 (AWF network gateway) instead of 172.17.0.1 (Docker bridge). The existing iptables bypass only covers 172.17.0.1, so MCP traffic to 172.30.0.1:80 gets DNAT-redirected to Squid. Concurrent SSE connections through Squid trigger the assertion failure and Squid segfaults, severing all proxied connections.
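
The routing decision described above can be sketched as a simple membership check. This is an illustrative sketch only, not code from the PR: the helper name `route_for` is hypothetical, and it models just the bypass list, not the full iptables rule set.

```shell
#!/bin/sh
# Hypothetical helper: print "direct" if the destination IP is on the
# bypass list, otherwise "squid" (it would be DNAT-redirected to the proxy).
route_for() {
  dest="$1"; shift
  for ip in "$@"; do
    if [ "$ip" = "$dest" ]; then
      echo direct
      return 0
    fi
  done
  echo squid
}

# Before the fix: only the Docker bridge address is on the bypass list.
route_for 172.30.0.1 172.17.0.1
# After the fix: the network gateway is on the list as well.
route_for 172.30.0.1 172.17.0.1 172.30.0.1
```

The first call models the failing case, where MCP traffic to 172.30.0.1 falls through to Squid; the second models the fixed case.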

Local Reproduction

Before fix — Squid crashes:

FATAL: assertion failed: comm.cc:1583: "isOpen(fd) && !commHasHalfClosedMonitor(fd)"
FATAL: Received Segment Violation...dying.

Test C: curl: (7) Failed to connect to 172.30.0.10 port 3128: Connection refused — Squid is dead.

After fix — Squid survives:

[iptables] Allow direct traffic to network gateway (172.30.0.1) - bypassing Squid...
  • Direct traffic to 172.30.0.1 reaches host SSE server (Server: BaseHTTP/0.6 Python)
  • SSE streams work: 10 events received, concurrent SSE+POSTs succeed
  • Squid access.log: zero 172.30.0.1 entries (fully bypassed)
  • Squid still alive after the storm, successfully proxies example.com
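
The "zero 172.30.0.1 entries" check can be approximated with a whole-token grep. This is a rough sketch under assumptions: the helper name `count_log_hits` is hypothetical, and the sample log lines are made up for illustration rather than taken from the actual run.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that mention an IP as a whole token.
# -F treats the dots literally; -w keeps 172.30.0.1 from matching
# 172.30.0.10 (the Squid address seen in the curl error above).
count_log_hits() {
  ip="$1"; log="$2"
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c -w -F -- "$ip" "$log" || true
}

# Made-up sample lines standing in for a Squid access.log.
log="$(mktemp)"
printf '%s\n' \
  'sample-line-1 CONNECT example.com:443 via 172.30.0.10' \
  'sample-line-2 GET http://172.30.0.10/ denied' \
  > "$log"

count_log_hits 172.30.0.1 "$log"
rm -f "$log"
```

Matching whole tokens matters here because a plain substring search for 172.30.0.1 would also count every line that mentions 172.30.0.10.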

Test plan

  • Unit tests pass (732/732 locally)
  • CI: Chroot integration tests pass
  • CI: Smoke chroot tests pass
  • Release v0.13.10 and update gh-aw smoke-codex to use it

🤖 Generated with Claude Code

Squid crashes with a segfault (comm.cc:1583 assertion failure) when
proxying concurrent MCP Streamable HTTP (SSE) connections from Codex
to the MCP gateway.

Root cause: Codex resolves host.docker.internal to 172.30.0.1 (the AWF
network gateway) instead of 172.17.0.1 (Docker bridge). The existing
iptables bypass only covers 172.17.0.1, so traffic to 172.30.0.1:80
gets DNAT-redirected to Squid, which crashes on concurrent SSE streams.

Fix: Dynamically detect the container's default network gateway via
`route -n` and add it to the iptables bypass list alongside
host.docker.internal, so MCP gateway traffic goes directly to the host.
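
The gateway detection can be sketched roughly as follows. This is a minimal sketch, not the PR's actual code: the function name `parse_default_gateway` is hypothetical, and it assumes the usual `route -n` layout in which the default route has destination 0.0.0.0 and the gateway in the second column.

```shell
#!/bin/sh
# Hypothetical helper: read `route -n` output on stdin and print the
# gateway of the first default route (destination 0.0.0.0), if any.
parse_default_gateway() {
  awk '$1 == "0.0.0.0" { print $2; exit }'
}

# Sample `route -n` output, inlined so the sketch runs without a container.
sample_routes='Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.30.0.1      0.0.0.0         UG    0      0        0 eth0
172.30.0.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0'

printf '%s\n' "$sample_routes" | parse_default_gateway
```

Inside a container the same helper would be fed live data, for example `route -n | parse_default_gateway`. It prints nothing when no default route exists, so callers should check for an empty result.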

Locally reproduced: before fix Squid crashes with the exact CI error;
after fix all SSE+POST traffic bypasses Squid and Squid stays alive.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 6, 2026 08:03

github-actions bot commented Feb 6, 2026

Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded.

github-actions bot commented Feb 6, 2026

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

github-actions bot commented Feb 6, 2026

🎬 THE END. Smoke Claude MISSION: ACCOMPLISHED! The hero saves the day! ✨

github-actions bot commented Feb 6, 2026

✅ Coverage Check Passed

Overall Coverage

| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 82.10% | 82.10% | ➡️ +0.00% |
| Statements | 82.14% | 82.14% | ➡️ +0.00% |
| Functions | 81.95% | 81.95% | ➡️ +0.00% |
| Branches | 75.44% | 75.44% | ➡️ +0.00% |

Coverage comparison generated by scripts/ci/compare-coverage.ts

github-actions bot commented Feb 6, 2026

C++ Build Test Results

Project CMake Build Status
fmt PASS
json PASS

Overall: PASS

All C++ projects built successfully.

AI generated by Build Test C++

github-actions bot commented Feb 6, 2026

Deno Build Test Results

| Project | Tests | Status |
|---|---|---|
| oak | 1/1 | ✅ PASS |
| std | 1/1 | ✅ PASS |

Overall: ✅ PASS

All Deno tests passed successfully.

AI generated by Build Test Deno

github-actions bot commented Feb 6, 2026

✅ Smoke Test Results

Last 2 Merged PRs:

Tests:

  • ✅ GitHub MCP: Retrieved PR data
  • ✅ Playwright: Page title verified ("GitHub · Change is constant. GitHub keeps you ahead.")
  • ✅ File I/O: Test file created and verified
  • ✅ Bash: Command execution successful

Status: PASS 🎉

cc @Mossaka

AI generated by Smoke Copilot

github-actions bot commented Feb 6, 2026

Smoke Test Results (Claude)

✅ GitHub MCP: Retrieved last 2 merged PRs

  • "chore: upgrade gh-aw workflows to v0.42.0 and fix strict mode violations"
  • "docs: add awf logs command documentation"

✅ Playwright: Navigated to https://github.com, title contains "GitHub"

✅ File Writing: Created /tmp/gh-aw/agent/smoke-test-claude-21743293656.txt

✅ Bash Tool: Verified file content

Status: PASS

AI generated by Smoke Claude

github-actions bot commented Feb 6, 2026

Go Build Test Results

Project Download Tests Status
color 1/1 PASS
env 1/1 PASS
uuid 1/1 PASS

Overall: PASS

All Go projects successfully downloaded dependencies and passed tests.

AI generated by Build Test Go

Copilot AI left a comment

Pull request overview

This PR fixes a critical Squid proxy crash that occurs when concurrent MCP Server-Sent Events (SSE) connections are proxied through Squid. The fix adds the container's network gateway IP (172.30.0.1) to the iptables bypass list, preventing MCP traffic destined for host.docker.internal from being redirected through Squid.

Changes:

  • Adds automatic detection of the container's default network gateway using route -n
  • Bypasses Squid for the network gateway IP when it differs from the Docker bridge gateway
  • Prevents Squid assertion failures (comm.cc:1583) caused by SSE connection handling
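
The "only when it differs" condition in the second bullet can be sketched as a small guard. This is an assumption-laden sketch: the function name `needs_gateway_bypass` and its argument order are hypothetical and are not taken from setup-iptables.sh.

```shell
#!/bin/sh
# Hypothetical guard: succeed only when a gateway was detected and it is
# not already covered by the existing host.docker.internal bypass.
needs_gateway_bypass() {
  network_gw="$1"; host_gw="$2"
  [ -n "$network_gw" ] && [ "$network_gw" != "$host_gw" ]
}

if needs_gateway_bypass 172.30.0.1 172.17.0.1; then
  echo "add bypass for 172.30.0.1"
fi
```

Checking for an empty gateway first avoids emitting a rule with an empty destination when detection fails.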

github-actions bot commented Feb 6, 2026

🧪 Bun Build Test Results

Status: ❌ FAILED

Bun test runner crashed in both projects due to containerized environment limitations.

| Project | Install | Tests | Status |
|---|---|---|---|
| elysia | N/A (no deps) | ❌ Crash | FAIL |
| hono | N/A (no deps) | ❌ Crash | FAIL |

Error Details:

bun test v1.3.8 (b64edcb4)
Aborted (core dumped)
Exit code: 134

Root Cause: Bun's test runner requires kernel capabilities not available in GitHub Actions containers. This is a known limitation when running Bun in restricted environments.

Overall: ❌ FAIL

⚠️ Tests cannot pass in current environment - Bun requires additional system capabilities for test execution.

AI generated by Build Test Bun

github-actions bot commented Feb 6, 2026

Node.js Build Test Results

Project Install Tests Status
clsx PASS PASS
execa PASS PASS
p-limit PASS PASS

Overall: PASS

All projects installed successfully and passed their test suites.

AI generated by Build Test Node.js

github-actions bot commented Feb 6, 2026

Security Review: Network Gateway Bypass

I've reviewed this PR and identified a security concern that should be addressed before merging.

Issue: Overly Permissive Gateway Bypass

The new bypass rules for the network gateway (172.30.0.1) create an unfiltered egress path when AWF_ENABLE_HOST_ACCESS is enabled.

Location: containers/agent/setup-iptables.sh:143-145

echo "[iptables] Allow direct traffic to network gateway (${NETWORK_GATEWAY_IP}) - bypassing Squid..."
iptables -t nat -A OUTPUT -d "$NETWORK_GATEWAY_IP" -j RETURN
iptables -A OUTPUT -d "$NETWORK_GATEWAY_IP" -j ACCEPT  # ⚠️ Allows ALL ports/protocols

Security Impact

  1. Bypasses domain filtering: Traffic to 172.30.0.1 no longer goes through Squid, so the domain allowlist is not enforced
  2. All ports allowed: The OUTPUT -j ACCEPT rule permits connections to any port on the host, not just HTTP/HTTPS (80/443)
  3. Protocol unrestricted: With no -p match, the rule accepts TCP, UDP, and any other protocol

This means malicious code in the agent container can connect to any service running on the Docker host (SSH on 22, databases on 3306/5432, etc.) by targeting 172.30.0.1, completely bypassing the firewall's protection.

Root Cause Analysis

The issue stems from needing to bypass Squid for MCP SSE traffic while maintaining security boundaries. The current implementation is too broad.

Recommended Mitigations

Option 1: Port-specific bypass (Recommended)
Only allow traffic to the specific port where the MCP gateway runs:

# Replace line 145 with port-specific rule
iptables -A OUTPUT -p tcp -d "$NETWORK_GATEWAY_IP" --dport 80 -j ACCEPT

Option 2: Restrict to HTTP/HTTPS only
If MCP gateway could run on multiple ports, at least limit to HTTP/HTTPS:

# Replace line 145 with protocol-specific rules
iptables -A OUTPUT -p tcp -d "$NETWORK_GATEWAY_IP" --dport 80 -j ACCEPT
iptables -A OUTPUT -p tcp -d "$NETWORK_GATEWAY_IP" --dport 443 -j ACCEPT

Option 3: Explicit port configuration
Add a configuration option --mcp-gateway-port to specify the exact port:

MCP_GATEWAY_PORT="${AWF_MCP_GATEWAY_PORT:-80}"
iptables -A OUTPUT -p tcp -d "$NETWORK_GATEWAY_IP" --dport "$MCP_GATEWAY_PORT" -j ACCEPT
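
If the port were made configurable as in Option 3, it would be prudent to validate the value before passing it to iptables. The following is a hedged sketch: `valid_port` is a hypothetical helper, and `AWF_MCP_GATEWAY_PORT` is the variable proposed above, not an existing setting.

```shell
#!/bin/sh
# Hypothetical validator: accept only a decimal port in 1..65535.
valid_port() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "${#1}" -le 5 ] && [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

MCP_GATEWAY_PORT="${AWF_MCP_GATEWAY_PORT:-80}"
if valid_port "$MCP_GATEWAY_PORT"; then
  # Dry run: print what would be allowed instead of calling iptables.
  echo "would allow tcp/$MCP_GATEWAY_PORT"
else
  echo "refusing invalid port: $MCP_GATEWAY_PORT" >&2
fi
```

Rejecting non-numeric input up front keeps a malformed environment variable from producing a broken or overly broad rule.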

Questions for Discussion

  1. What port does the MCP gateway actually listen on? (The PR description refers to traffic to 172.30.0.1:80, and the test server reports Server: BaseHTTP/0.6 Python)
  2. Is there a reason why the bypass needs to allow ALL ports instead of just the MCP gateway port?
  3. Should we also add a host-level iptables rule to further restrict what the agent can access on the gateway IP?

Recommendation: Please update the bypass rule to be port-specific rather than allowing all traffic to the gateway IP. This maintains security while fixing the Squid SSE crash issue.

AI generated by Security Guard

github-actions bot commented Feb 6, 2026

Chroot Runtime Version Test Results

| Runtime | Host Version | Chroot Version | Match? |
|---|---|---|---|
| Python | Python 3.12.12 | Python 3.12.3 | ❌ NO |
| Node.js | v24.13.0 | v20.20.0 | ❌ NO |
| Go | go1.22.12 | go1.22.12 | ✅ YES |

Overall Status: ❌ FAILED

Some runtime versions differ between host and chroot environments. Go matches successfully, but Python and Node.js versions don't align.

AI generated by Smoke Chroot

github-actions bot commented Feb 6, 2026

❌ Rust Build Test Failed

Status: FAILED - Unable to execute Rust toolchain in AWF container environment

Error Details

The Rust toolchain (cargo/rustup) cannot be executed within the AWF firewall container. All attempts to run cargo commands result in unexpected behavior where the cargo binary outputs bash version information instead of executing properly.

Projects Not Tested

Project Build Tests Status
fd NOT TESTED
zoxide NOT TESTED

Overall: FAILED

Technical Details

  • Repository clone: ✅ Success
  • Rust installation: ⚠️ Completed but non-functional
  • Cargo execution: ❌ Failed - binary produces unexpected output
  • Environment: AWF chroot container (AWF_CHROOT_ENABLED=true)

Root Cause

When attempting to execute /home/runner/.cargo/bin/cargo (or rustup), the binary outputs:

GNU bash, version 5.2.21(1)-release (x86_64-pc-linux-gnu)

This occurs even when:

  • Using absolute paths
  • Executing via dynamic linker (/lib64/ld-linux-x86-64.so.2)
  • Running through Python subprocess
  • Using various shell invocation methods

Recommendation

The Rust toolchain needs to be pre-installed on the host system before the AWF container starts, or the test should run outside the AWF container environment. The issue appears to be related to how binaries are executed within the chroot-enabled AWF container.

AI generated by Build Test Rust

Address security review: narrow the OUTPUT ACCEPT rule from all
ports/protocols to only TCP port 80 (where MCP gateway runs).
The NAT RETURN rule remains broad since DNAT only catches 80/443.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
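
The resulting rule pair can be sketched in dry-run form. This is an illustrative sketch, not the script itself: the helper name `gateway_bypass_rules` is hypothetical, and it only prints the commands so it can run without root.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print (rather than run) the two rules for a
# gateway IP and port, mirroring the narrowed rule described above.
gateway_bypass_rules() {
  gw="$1"; port="$2"
  # NAT table: skip the redirect to Squid for this destination.
  echo "iptables -t nat -A OUTPUT -d $gw -j RETURN"
  # Filter table: accept only TCP traffic to the given port.
  echo "iptables -A OUTPUT -p tcp -d $gw --dport $port -j ACCEPT"
}

gateway_bypass_rules 172.30.0.1 80
```

Keeping the RETURN rule unrestricted while narrowing only the ACCEPT rule matches the reasoning in the commit message: the redirect applies only to ports 80 and 443, so the filter rule is the one that determines which ports are reachable.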

github-actions bot commented Feb 6, 2026

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

github-actions bot commented Feb 6, 2026

🎬 THE END. Smoke Claude MISSION: ACCOMPLISHED! The hero saves the day! ✨

github-actions bot commented Feb 6, 2026

Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded.

github-actions bot commented Feb 6, 2026

Claude Smoke Test Results

Last 2 merged PRs:

✅ GitHub MCP test
✅ Playwright test (title contains "GitHub")
✅ File write test
✅ Bash test

Status: PASS

AI generated by Smoke Claude

github-actions bot commented Feb 6, 2026

Deno Build Test Results

| Project | Tests | Status |
|---|---|---|
| oak | 1/1 | ✅ PASS |
| std | 1/1 | ✅ PASS |

Overall: ✅ PASS

All Deno tests completed successfully.

AI generated by Build Test Deno

github-actions bot commented Feb 6, 2026

Node.js Build Test Results

Project Install Tests Status
clsx PASS PASS
execa PASS PASS
p-limit PASS PASS

Overall: PASS

All Node.js projects successfully installed dependencies and passed their test suites.

AI generated by Build Test Node.js

github-actions bot commented Feb 6, 2026

Go Build Test Results

Project Download Tests Status
color 1/1 PASS
env 1/1 PASS
uuid 1/1 PASS

Overall: PASS

All Go projects successfully downloaded dependencies and passed tests.

AI generated by Build Test Go

github-actions bot commented Feb 6, 2026

Smoke Test Results

Last 2 Merged PRs:

Test Results:

  • ✅ GitHub MCP: Retrieved PR data
  • ✅ Playwright: GitHub page loaded (title: "GitHub · Change is constant. GitHub keeps you ahead. · GitHub")
  • ✅ File Writing: Created /tmp/gh-aw/agent/smoke-test-copilot-21743665454.txt
  • ✅ Bash Tool: Verified file content

Status: PASS

cc @Mossaka

AI generated by Smoke Copilot

github-actions bot commented Feb 6, 2026

Build Test: Bun - FAILED

Status: FAIL

Results

| Project | Install | Tests | Status |
|---|---|---|---|
| elysia | N/A* | 0/0 | ❌ CRASH |
| hono | N/A* | 0/0 | ❌ CRASH |

Overall: FAIL - Test runner crashes in GitHub Actions environment

Error Details

Issue: Bun test runner (v1.3.8) crashes with core dump on both projects:

bun test v1.3.8 (b64edcb4)
Aborted (core dumped)
Exit code: 134

Environment: GitHub Actions runner

  • Bun installation: ✅ Success (v1.3.8)
  • Repository clone: ✅ Success
  • Test execution: ❌ Core dump

* bun install also fails with internal error (NotDir), likely related to same compatibility issue

Recommendation

This appears to be a compatibility issue between Bun v1.3.8 and the GitHub Actions runner environment. Consider:

  • Testing with Docker container with Bun pre-installed
  • Testing with different Bun version
  • Reporting issue to Bun team regarding GitHub Actions compatibility

AI generated by Build Test Bun

github-actions bot commented Feb 6, 2026

Build Test: Java - FAILED ❌

Status: Unable to execute Java tests due to environment configuration issue.

Error Details

The Java runtime environment is not functioning correctly in the GitHub Actions runner:

  • Running java -version outputs bash version information instead of Java version
  • Maven compilation cannot proceed without a working Java runtime
  • Java binary exists at /opt/hostedtoolcache/Java_Temurin-Hotspot_jdk/21.0.10-7/x64/bin/java but does not run correctly (it prints bash version information instead)

Investigation Results

| Check | Result |
|---|---|
| Repository Clone | ✅ Success |
| Java Binary Present | ✅ Found at /opt/hostedtoolcache/Java_Temurin-Hotspot_jdk/21.0.10-7/x64/bin/java |
| Java Execution | ❌ Failed - outputs bash version instead |
| Maven Available | ✅ Found at /usr/share/apache-maven-3.9.12 |
| Maven Execution | ❌ Failed - depends on Java |

Next Steps

This appears to be a GitHub Actions runner or shell environment configuration issue rather than a code problem. The workflow may need:

  1. Explicit Java setup step before running tests
  2. Investigation of shell initialization scripts
  3. Environment variable configuration review

AI generated by Build Test Java

github-actions bot commented Feb 6, 2026

Chroot Version Comparison Test Results

| Runtime | Host Version | Chroot Version | Match? |
|---|---|---|---|
| Python | Python 3.12.12 | Python 3.12.3 | ❌ NO |
| Node.js | v24.13.0 | v20.20.0 | ❌ NO |
| Go | go1.22.12 | go1.22.12 | ✅ YES |

Overall Status: ❌ Tests FAILED - Not all runtime versions match between host and chroot environments.

The chroot mode successfully uses host binaries, but Python and Node.js versions differ. This is expected behavior when the host and container have different installations.

AI generated by Smoke Chroot

github-actions bot commented Feb 6, 2026

Build Test: Rust - FAILED ❌

Unable to execute Rust build tests due to missing Rust toolchain.

Issue

The Rust toolchain (cargo/rustc) is not available in the GitHub Actions runner environment, and installation via rustup failed within the AWF firewall container.

Root Cause

When attempting to install Rust via the official rustup installer (https://sh.rustup.rs), the downloaded rustup binary was corrupted - it appears to be a bash binary instead of the actual rustup installer. This suggests a potential issue with:

  • Network proxying through the AWF firewall interfering with binary downloads
  • Missing required domains in the firewall whitelist for Rust installation
  • Or a fundamental incompatibility with installing development toolchains inside the AWF container

Attempted Solutions

  1. ✅ Successfully cloned test repository: Mossaka/gh-aw-firewall-test-rust
  2. ❌ Attempted to install Rust via curl https://sh.rustup.rs | sh
  3. ❌ Verified rustup binary was corrupted (contained bash instead of rustup)
  4. ❌ Could not proceed with building fd or zoxide projects

Recommendation

To run Rust build tests, one of the following is needed:

  • Pre-install Rust toolchain in the GitHub Actions runner before entering the AWF container
  • Add required Rust installation domains to the AWF firewall whitelist
  • Use a different approach for testing Rust projects within the firewall environment

Test Results

Project Build Tests Status
fd N/A FAIL - No Rust toolchain
zoxide N/A FAIL - No Rust toolchain

Overall: FAIL

AI generated by Build Test Rust

Mossaka merged commit eec2739 into main on Feb 6, 2026
81 checks passed
Mossaka deleted the fix/bypass-squid-for-network-gateway branch on February 6, 2026 08:33
Mossaka added a commit to github/gh-aw that referenced this pull request Feb 6, 2026
Fixes Squid crash on MCP Streamable HTTP (SSE) traffic from Codex.
See github/gh-aw-firewall#553 for details.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>