Environment
- Master: macOS 26.3.1, Apple M1 (8GB), Cake v0.1.0 built from source (commit f2e6b1ef area, Metal feature)
- Worker: iPad Air 13-inch M3 (iPad15,5), iPadOS 26.2, Cake iOS app built from source (KMP + Rust static lib, aarch64-apple-ios, Metal feature)
- Network: Same WiFi LAN subnet (10.0.0.x), both devices reachable via ping
Behavior
UDP zero-config discovery works perfectly — the master discovers the iPad worker and assigns layers:
discovered worker 'My Phone' at 10.0.0.246:10128 with 1 GPU(s)
iPad15,5 — 7.4 GiB (~3.0 TFLOPS)
discovery complete: 1 worker(s) found
master: 3.2 TFLOPS — workers: 3.0 TFLOPS — assigning 12 of 24 layers to workers
My Phone (7.4 GiB, 3.0 TFLOPS) → 12 layers (475 MiB)
connecting to worker 'My Phone' at 10.0.0.246:10128 ...
Error: can't connect to 10.0.0.246:10128: No route to host (os error 65)
The subsequent TCP connection to the worker always fails with EHOSTUNREACH (os error 65) or ECONNREFUSED (os error 61).
Diagnostic evidence
nc -z probe succeeds, explicit connect() fails:
# nc -z (connects, then closes without sending data): succeeds every time
$ nc -z -w 3 10.0.0.246 10128
Connection to 10.0.0.246 port 10128 [tcp/bmc-perf-sd] succeeded!
# Python full TCP connect() — fails every time
$ python3 -c "
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(5)
s.connect(('10.0.0.246', 10128))
"
# → [Errno 61] Connection refused
This pattern was 100% reproducible across 10+ attempts. nc -z always reports the port as open. Python socket.connect() and Cake's Rust TcpStream::connect() always fail.
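To make the failure easier to quantify, the one-off Python check above can be wrapped in a small repeatable probe. This is only a sketch: the helper names `probe` and `probe_many` are hypothetical and not part of Cake. It performs a full connect() each time and tallies the resulting errno names, which should help show whether EHOSTUNREACH and ECONNREFUSED alternate or whether one dominates.

```python
import errno
import socket
import sys
from collections import Counter


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt one full TCP connect() and return 'OK' or the errno name."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        try:
            s.connect((host, port))
        except socket.timeout:
            # No SYN-ACK and no RST arrived within the timeout.
            return "TIMEOUT"
        except OSError as exc:
            # Map the numeric errno to its symbolic name for this platform.
            return errno.errorcode.get(exc.errno, f"errno {exc.errno}")
        return "OK"


def probe_many(host: str, port: int, attempts: int = 10,
               timeout: float = 5.0) -> Counter:
    """Run probe() several times and count each distinct outcome."""
    return Counter(probe(host, port, timeout) for _ in range(attempts))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python3 probe.py 10.0.0.246 10128
    print(probe_many(sys.argv[1], int(sys.argv[2])))
```

Running it from the master against the worker address (10.0.0.246, port 10128) while the app shows "waiting for master" gives a per-outcome count that can be pasted into the issue.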
Ruling out network issues:
- ping 10.0.0.246 succeeds (56 ms avg)
- route get 10.0.0.246 routes through en0 (WiFi LAN), not Tailscale
- UDP discovery broadcast on port 10127 works both directions
- iPad Local Network permission is granted (Settings → app → Local Network = ON)
- iPad Developer Mode is enabled
- iPad screen auto-lock is OFF, app is in foreground showing "waiting for master"
Hypothesis
The iOS Cake worker binds TcpListener on 0.0.0.0:10128, but the iOS sandbox or network stack may be interfering with accept(). The port appears open to nc -z (which connects and immediately closes), yet connect() calls from Python and from Cake's Rust TcpStream are refused or unreachable. This could be:
- iOS sandbox restricting incoming TCP connections after the initial bind
- The TcpListener not being polled/accepted in time (async runtime issue)
- An iOS network entitlement requirement for accepting incoming connections that isn't in the current entitlements
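One way to narrow these hypotheses is to check what a listener that never calls accept() looks like from the client side. The sketch below is a loopback experiment, not Cake code, and the function name `connect_without_accept` is hypothetical. On macOS and Linux the kernel completes the handshake for a listening socket and queues the connection in the backlog, so connect() is expected to succeed even though accept() is never called. If iOS behaves the same way (an assumption this report cannot confirm), a stalled accept loop would more likely show up as a hang after connecting than as ECONNREFUSED, which would point toward the sandbox or entitlement explanations instead.

```python
import socket


def connect_without_accept(timeout: float = 2.0) -> bool:
    """Return True if connect() completes against a listener that never accepts."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(8)               # backlog of 8; accept() is never called
    port = listener.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        # The kernel answers the SYN on behalf of the listening socket.
        client.connect(("127.0.0.1", port))
        return True
    except OSError:
        return False
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print("connect succeeded without accept():", connect_without_accept())
```

The same experiment run inside the iOS app (binding the listener there and connecting from the Mac) would show whether the refusal happens before the listener's accept loop is even involved.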
Steps to reproduce
- Build cake-mobile for iOS: RUSTC_WRAPPER="" SDKROOT=$(xcrun --sdk iphoneos --show-sdk-path) IPHONEOS_DEPLOYMENT_TARGET=16.0 cargo build --release --target=aarch64-apple-ios -p cake-mobile --features metal
- Build KMP framework: cd cake-mobile-app && ./gradlew :shared:linkReleaseFrameworkIosArm64
- Deploy to iPad via Xcode
- Start worker with any cluster key
- On macOS master: cake run <model> --cluster-key <same-key> --discovery-timeout 10
- Discovery succeeds, TCP connection fails
Expected
Master connects to worker on TCP 10128, streams weight shards, distributed inference begins.
Actual
Error: can't connect to 10.0.0.246:10128: No route to host (os error 65)