fix: iOS worker TCP connection failures (issue #79) #84
…cking, UIBackgroundModes Agent-Logs-Url: https://github.com/evilsocket/cake/sessions/272a2117-aebc-4a10-a2ab-0348f3040921 Co-authored-by: evilsocket <86922+evilsocket@users.noreply.github.com>
…MEOUT value Agent-Logs-Url: https://github.com/evilsocket/cake/sessions/272a2117-aebc-4a10-a2ab-0348f3040921 Co-authored-by: evilsocket <86922+evilsocket@users.noreply.github.com>
evilsocket
left a comment
CI failed, check why and fix
…iles field on Windows Agent-Logs-Url: https://github.com/evilsocket/cake/sessions/4cd3dc02-5334-464d-8e1d-e9a2135e8792 Co-authored-by: evilsocket <86922+evilsocket@users.noreply.github.com>
Found two pre-existing CI failures in …
…r.rs for Android compat Agent-Logs-Url: https://github.com/evilsocket/cake/sessions/8c7587c9-58e7-4276-80dd-82a6b595cbef Co-authored-by: evilsocket <86922+evilsocket@users.noreply.github.com>
Fixed in 0f1eebb. The Android build was still failing in …
Summary
Investigated and fixed the three root causes of the iOS worker TCP connection failures reported in issue #79.
Root cause analysis
The issue manifests as `EHOSTUNREACH`/`ECONNREFUSED` when the master tries to TCP-connect to the iOS worker, even though UDP discovery succeeds and `nc -z` (quick connect-then-close) works. Three compounding bugs were identified:

1. No retry on setup connection – the master made exactly one TCP connect attempt; any transient iOS network condition (brief socket initialisation delay after UDP advertisement, app briefly suspended by iOS, etc.) caused immediate failure with no recovery.
2. `Context::from_args` blocked the Tokio runtime – in `run_zero_config_worker` (and `run_direct_worker`), the model-weight loading call `Context::from_args(args)` was synchronous inside an `async fn`. On a multi-thread runtime this starves the async reactor during load, preventing the pre-bound `TcpListener` from being polled for incoming inference reconnections in a timely manner.
3. iOS network socket lifecycle – iOS can suspend TCP sockets when the app is briefly in the background. `UIBackgroundModes: [voip]` keeps sockets alive. `NSBonjourServices` was missing the `_cake._tcp` entry, so iOS's Local Network permission did not explicitly cover the TCP inference port.

Changes
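As an illustration of the retry behaviour that addresses the first root cause, here is a minimal, std-only sketch. It is not the code from this PR: the helper names `retry` and `connect_with_retry`, their parameters, and the use of `eprintln!` as a stand-in for `WARN`-level logging are all assumptions made for the example.

```rust
use std::fmt::Display;
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::thread;
use std::time::Duration;

/// Hypothetical helper: call `op` until it succeeds or `max_attempts` is
/// reached, sleeping `delay` between attempts. `op` receives the 1-based
/// attempt number. At least one attempt is always made.
pub fn retry<T, E, F>(max_attempts: u32, delay: Duration, mut op: F) -> Result<T, E>
where
    E: Display,
    F: FnMut(u32) -> Result<T, E>,
{
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the last error back to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(err) => {
                // Stand-in for a WARN-level log line that includes the attempt count.
                eprintln!("WARN: attempt {attempt}/{max_attempts} failed: {err}; retrying");
                thread::sleep(delay);
                attempt += 1;
            }
        }
    }
}

/// Hypothetical wrapper: retry a TCP connect so that a transient
/// EHOSTUNREACH/ECONNREFUSED does not fail the whole setup.
/// `connect_timeout` must be non-zero, as required by `TcpStream::connect_timeout`.
pub fn connect_with_retry(
    addr: SocketAddr,
    max_attempts: u32,
    connect_timeout: Duration,
    delay: Duration,
) -> io::Result<TcpStream> {
    retry(max_attempts, delay, |_| TcpStream::connect_timeout(&addr, connect_timeout))
}
```

A caller might use something like `connect_with_retry(addr, 5, Duration::from_secs(3), Duration::from_millis(500))`; those numbers are illustrative only and are not taken from the PR.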
`cake-core/src/cake/sharding/mod.rs`
- Retry the setup TCP connection rather than failing on the first attempt (root cause 1); … `WARN` level with attempt count.

`cake-mobile/src/lib.rs`
- `run_zero_config_worker`: wrap `Context::from_args` with `tokio::task::spawn_blocking` so model loading runs on a dedicated blocking-thread pool, keeping the Tokio async reactor unblocked while weights load.
- `run_direct_worker`: same `spawn_blocking` fix for consistency.
- `JoinError` (task panic) is handled gracefully and surfaced as an error status.

`cake-mobile-app/iosApp/iosApp/Info.plist`
- Add `UIBackgroundModes: [voip]` – prevents iOS from suspending the TCP listener when the app is briefly backgrounded during inference.
- Add `_cake._tcp` to `NSBonjourServices` – ensures the iOS Local Network permission covers the TCP inference connection in addition to the UDP discovery service.

Testing
All 826 existing tests pass (`cargo test -p cake-core --lib --test unit --test protocol`). No new Clippy warnings. CodeQL scan: 0 alerts.
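
For readers unfamiliar with the `spawn_blocking` change in `cake-mobile/src/lib.rs`, the underlying pattern is to move the synchronous load onto another thread and to turn a panic in that thread into an ordinary error. The PR does this with `tokio::task::spawn_blocking`, where awaiting the returned handle yields a `Result` whose error type is `JoinError`. To stay dependency-free, the sketch below shows the analogous pattern with `std::thread` instead. The names `spawn_load` and `finish_load` and the error message are hypothetical.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical helper: start a slow, synchronous load on its own thread so
/// the calling thread stays free (for example, to keep servicing a listener).
pub fn spawn_load<T, F>(load: F) -> JoinHandle<Result<T, String>>
where
    F: FnOnce() -> Result<T, String> + Send + 'static,
    T: Send + 'static,
{
    thread::spawn(load)
}

/// Hypothetical helper: wait for the load and flatten the two failure modes.
/// `join()` returns `Err` if the thread panicked, which plays the role that
/// `JoinError` plays in the Tokio version.
pub fn finish_load<T>(handle: JoinHandle<Result<T, String>>) -> Result<T, String> {
    match handle.join() {
        // The loader ran to completion; pass its own Ok/Err through.
        Ok(result) => result,
        // The loader panicked; surface that as an error instead of crashing.
        Err(_panic) => Err("model load thread panicked".to_string()),
    }
}
```

The std version differs from the Tokio one in an important way: `finish_load` blocks the calling thread while it waits, whereas awaiting a `spawn_blocking` handle lets the runtime schedule other tasks on that worker thread in the meantime.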