[codex] improve iOS realtime talk mode#86355

Merged
ngutman merged 9 commits into main from ios-realtime-talk-secret-scope on May 25, 2026

ngutman (Member) commented May 25, 2026

Summary

  • Add the direct iOS realtime talk path using gateway-issued client sessions instead of relaying voice through the gateway.
  • Replace the full-screen talk overlay with the compact toolbar tray/status surface and permission prompt flow.
  • Keep the voice waveform responsive to realtime microphone speech and assistant playback, including a playback-drain grace period so speaking animation does not stop before audio finishes.
  • Remove the unused old talk orb overlay and keep gateway talk-client/shared surfaces out of this PR cleanup.
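
The playback-drain grace period in the third bullet can be pictured with a minimal sketch. This is not the PR's implementation: the type `SpeakingIndicator`, its method names, and the 0.5 second grace value are all hypothetical, and times are plain seconds so the example stays free of platform APIs.

```swift
/// Hypothetical model of a "speaking" indicator with a playback-drain grace period.
struct SpeakingIndicator {
    /// Assumed grace window in seconds; the PR does not state the real value.
    let drainGrace: Double
    private var isReceivingAudio = false
    private var streamEndedAt: Double?

    init(drainGrace: Double) {
        self.drainGrace = drainGrace
    }

    /// Call whenever an assistant audio chunk is queued for playback.
    mutating func audioChunkArrived() {
        isReceivingAudio = true
        streamEndedAt = nil
    }

    /// Call when the stream reports that no more assistant audio is coming.
    mutating func audioStreamEnded(at now: Double) {
        isReceivingAudio = false
        streamEndedAt = now
    }

    /// Active while audio arrives, and for `drainGrace` seconds after the stream ends.
    func isSpeaking(at now: Double) -> Bool {
        if isReceivingAudio { return true }
        guard let endedAt = streamEndedAt else { return false }
        return now - endedAt < drainGrace
    }
}

var indicator = SpeakingIndicator(drainGrace: 0.5)
indicator.audioChunkArrived()
indicator.audioStreamEnded(at: 2.0)
print(indicator.isSpeaking(at: 2.3)) // inside the grace window
print(indicator.isSpeaking(at: 2.6)) // after the grace window
```

Keeping the grace window as an explicit parameter would make the waveform/audio sync tunable without touching the animation code.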

Verification

  • swiftformat --config config/swiftformat apps/ios/Sources/Voice/TalkRealtimeWebRTCSession.swift
  • swiftlint lint --config apps/ios/.swiftlint.yml apps/ios/Sources/HomeToolbar.swift apps/ios/Sources/RootCanvas.swift apps/ios/Sources/Voice/TalkModeManager.swift apps/ios/Sources/Voice/TalkRealtimeWebRTCSession.swift
  • git diff --check
  • xcodebuild -project apps/ios/OpenClaw.xcodeproj -scheme OpenClaw -configuration Debug -destination 'id=6A19D82B-4EA6-5D67-B3A7-0AB3B71B550C' build
  • xcrun devicectl device install app --device 6A19D82B-4EA6-5D67-B3A7-0AB3B71B550C <built OpenClaw.app>
  • xcrun devicectl device process launch --device 6A19D82B-4EA6-5D67-B3A7-0AB3B71B550C ai.openclaw.ios.test.guti-gzs353x62e

Known warning: the full iOS build still reports the pre-existing OnboardingWizardView.swift type_body_length SwiftLint warning.

Real behavior proof

Behavior addressed: iOS talk mode can request talk access, start direct realtime voice, show status above the toolbar, animate on microphone speech, and keep the speaking waveform active until assistant playback drains.

Real environment tested: physical iPhone target 6A19D82B-4EA6-5D67-B3A7-0AB3B71B550C with bundle ai.openclaw.ios.test.guti-gzs353x62e.

Exact steps or command run after this patch: built the iOS app with xcodebuild, installed it with xcrun devicectl device install app, and launched it with xcrun devicectl device process launch.

Evidence after fix: build, install, and launch all completed successfully; targeted SwiftLint on the changed talk UI/realtime files passed with 0 violations.

Observed result after fix: the current branch build is installed and running on the device for manual voice-mode testing.

What was not tested: no automated audio-output duration assertion; final waveform/audio sync is validated manually on the physical device.

@openclaw-barnacle openclaw-barnacle Bot added app: ios App: ios size: XL maintainer Maintainer-authored PR labels May 25, 2026
clawsweeper Bot (Contributor) commented May 25, 2026

Codex review: found issues before merge. Reviewed May 25, 2026, 5:38 AM ET / 09:38 UTC.

Summary
The PR adds a direct iOS WebRTC realtime Talk path, toolbar permission/status UI, gateway Talk permission upgrade handling, WebRTC SwiftPM dependency wiring, and related bootstrap-scope tests.

PR surface: Source +1, Tests +10, Other +2053. Total +2064 across 17 files.

Reproducibility: yes, for the blocking review finding: current main routes .realtimeRelay through RealtimeTalkRelaySession, while the PR maps that same execution mode into realtimeWebRTCEnabled. I did not run a physical iOS gateway-relay scenario in this read-only review.

Review metrics: 3 noteworthy metrics.

  • PR surface: 17 files, +2321/-257. The branch spans iOS UI, voice runtime, gateway auth/bootstrap scopes, tests, and package configuration, so it needs cross-surface review.
  • Realtime ownership change: 1 gateway-owned relay path replaced by 1 client-owned WebRTC path. Changing who owns realtime transport affects compatibility, provider routing, and security boundaries before merge.
  • External dependency: 1 SwiftPM package added, 0 iOS Package.resolved. A new WebRTC dependency on the voice/auth path is supply-chain-sensitive even with an exact version.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🦞 diamond lobster
Patch quality: 🧂 unranked krab
Result: blocked by patch quality or review findings.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
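
As an illustration of the "weaker of the two" rule, here is a minimal sketch. The `Rank` enum and `overallRank` function are hypothetical names; the ordering is taken from the rank legend in this comment, and the off-meta tidepool rating is left out because it means the rating does not apply.

```swift
/// Hypothetical encoding of the crustacean ranks, ordered from weakest to strongest.
enum Rank: Int, Comparable {
    case unrankedKrab = 0
    case silverShellfish
    case goldShrimp
    case platinumHermit
    case diamondLobster
    case challengerCrab

    static func < (lhs: Rank, rhs: Rank) -> Bool {
        lhs.rawValue < rhs.rawValue
    }
}

/// Overall readiness is capped by whichever input is weaker.
func overallRank(proof: Rank, patchQuality: Rank) -> Rank {
    min(proof, patchQuality)
}

// Mirrors this review: diamond lobster proof, unranked krab patch quality.
print(overallRank(proof: .diamondLobster, patchQuality: .unrankedKrab))
```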

Rank-up moves:

  • Preserve the Gateway-owned relay path for executionMode == .realtimeRelay, or document and get explicit maintainer approval for removing that compatibility.
  • Record maintainer/security acceptance of the direct client-secret WebRTC boundary.
  • Confirm the WebRTC dependency resolution policy for iOS before landing.

Risk before merge

  • Existing iOS setups configured for talk.realtime.transport: gateway-relay can be silently moved from Gateway-owned talk.session.* relay to client-owned WebRTC talk.client.*, changing provider support, permissions, and failure modes.
  • The direct WebRTC path moves a gateway-issued provider client secret and realtime audio exchange onto the iOS device, so maintainer security acceptance should be explicit before merge.
  • The PR adds a third-party SwiftPM WebRTC dependency on a voice/auth-sensitive path; exactVersion helps, but dependency and package-resolution policy still need maintainer review.

Maintainer options:

  1. Preserve gateway-relay before merge (recommended)
    Keep existing executionMode == .realtimeRelay configs on the Gateway-owned relay path, and enable the new WebRTC path only for client-owned WebRTC/provider-compatible configs.
  2. Accept a deliberate iOS transport migration
    Maintainers can intentionally migrate iOS gateway-relay configs to client-owned WebRTC, but the PR should state the upgrade impact and expected supported provider set.
  3. Pause for security and dependency review
    If the client-secret-on-device boundary or new WebRTC dependency is not yet approved, pause the branch until the intended iOS Talk ownership model is settled.

Next step before merge
This protected maintainer PR needs author/maintainer judgment on gateway-relay compatibility plus security/dependency review, not cleanup closure or an automated repair lane.

Security
Needs attention: The diff introduces a direct client-secret WebRTC boundary and a new WebRTC dependency on a voice/auth-sensitive iOS path, so security/dependency approval is still needed.

Review findings

  • [P1] Preserve gateway-relay configs — apps/ios/Sources/Voice/TalkModeManager.swift:2347
Review details

Best possible solution:

Land the direct iOS WebRTC path only after preserving or explicitly retiring gateway-relay compatibility, and after maintainer security/dependency signoff on the client-secret WebRTC boundary.

Do we have a high-confidence way to reproduce the issue?

Yes for the blocking review finding: current main routes .realtimeRelay through RealtimeTalkRelaySession, while the PR maps that same execution mode into realtimeWebRTCEnabled. I did not run a physical iOS gateway-relay scenario in this read-only review.

Is this the best way to solve the issue?

No; adding direct WebRTC may be the right direction, but this merge shape should preserve existing gateway-relay behavior or explicitly document and approve the compatibility break.

Full review comments:

  • [P1] Preserve gateway-relay configs — apps/ios/Sources/Voice/TalkModeManager.swift:2347
    Current main treats talk.realtime.transport: gateway-relay as a Gateway-owned relay path using RealtimeTalkRelaySession, but this change sets realtimeWebRTCEnabled for executionMode == .realtimeRelay. Existing iOS gateway-relay configs will silently switch to talk.client.create/WebRTC and can fail for providers or deployments that rely on the documented Gateway-owned talk.session.* relay.
    Confidence: 0.87
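
One possible shape for the recommended fix is sketched below. It is an assumption-laden illustration, not the code in TalkModeManager.swift: only `.realtimeRelay` is named in this review, while `.realtimeClientWebRTC`, `.nonRealtime`, `RealtimeRoute`, and `realtimeRoute(for:)` are hypothetical names introduced here.

```swift
/// Execution modes. Only `.realtimeRelay` appears in the review; the rest are hypothetical.
enum ExecutionMode {
    case realtimeRelay
    case realtimeClientWebRTC // hypothetical: client-owned WebRTC config
    case nonRealtime          // hypothetical: any non-realtime talk mode
}

/// Which side owns the realtime transport.
enum RealtimeRoute: Equatable {
    case gatewayRelay // Gateway-owned talk.session.* relay
    case clientWebRTC // client-owned talk.client.create session
    case notRealtime
}

/// Keeps gateway-relay configs on the relay path instead of folding them into WebRTC.
func realtimeRoute(for mode: ExecutionMode) -> RealtimeRoute {
    switch mode {
    case .realtimeRelay:
        return .gatewayRelay
    case .realtimeClientWebRTC:
        return .clientWebRTC
    case .nonRealtime:
        return .notRealtime
    }
}

print(realtimeRoute(for: .realtimeRelay))
```

An exhaustive `switch` makes the compiler flag any newly added mode, so a future transport cannot silently inherit the WebRTC path.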

Overall correctness: patch is incorrect
Overall confidence: 0.86

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 033693843c26.

Label changes

Label changes:

  • add rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🦞 diamond lobster and patch quality is 🧂 unranked krab.
  • remove rating: 🦐 gold shrimp: Current PR rating is rating: 🧂 unranked krab, so this older rating label is no longer current.

Label justifications:

  • P2: This is a normal-priority but user-facing iOS Talk feature with bounded merge blockers in compatibility and security review.
  • merge-risk: 🚨 compatibility: Existing gateway-relay Talk configs can be rerouted to the new WebRTC client path instead of the current Gateway-owned relay path.
  • merge-risk: 🚨 auth-provider: The PR changes operator Talk secret scope handling and provider client-session routing for iOS Talk.
  • merge-risk: 🚨 security-boundary: The PR puts a gateway-issued realtime provider client secret and a new WebRTC dependency on the iOS device path.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🦞 diamond lobster and patch quality is 🧂 unranked krab.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body reports a physical iPhone build, install, launch, and manual voice-mode validation after the patch; this is sufficient maintainer PR proof but does not resolve the compatibility/security review blockers.
Evidence reviewed

PR surface:

Source +1, Tests +10, Other +2053. Total +2064 across 17 files.

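PR surface stats:

| Area      | Files | Added | Removed | Net   |
|-----------|-------|-------|---------|-------|
| Source    | 1     | 1     | 0       | +1    |
| Tests     | 2     | 20    | 10      | +10   |
| Docs      | 0     | 0     | 0       | 0     |
| Config    | 0     | 0     | 0       | 0     |
| Generated | 0     | 0     | 0       | 0     |
| Other     | 14    | 2300  | 247     | +2053 |
| Total     | 17    | 2321  | 257     | +2064 |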

Security concerns:

  • [medium] Direct provider client secret on device — apps/ios/Sources/Voice/TalkRealtimeWebRTCSession.swift:298
    The new WebRTC path sends the gateway-issued client secret as a bearer token from iOS to the provider endpoint; this may be intended, but it changes the auth boundary from gateway relay to client-owned realtime transport.
    Confidence: 0.82
  • [low] New WebRTC dependency on voice path — apps/ios/project.yml:18
    The iOS app adds https://github.com/stasel/WebRTC.git at exactVersion 147.0.0 without an iOS Package.resolved file, so maintainers should explicitly accept the dependency and resolution policy before merge.
    Confidence: 0.78

Acceptance criteria:

  • swiftformat --config config/swiftformat apps/ios/Sources/Voice/TalkRealtimeWebRTCSession.swift
  • swiftlint lint --config apps/ios/.swiftlint.yml apps/ios/Sources/HomeToolbar.swift apps/ios/Sources/RootCanvas.swift apps/ios/Sources/Voice/TalkModeManager.swift apps/ios/Sources/Voice/TalkRealtimeWebRTCSession.swift
  • xcodebuild -project apps/ios/OpenClaw.xcodeproj -scheme OpenClaw -configuration Debug build
  • node scripts/run-vitest.mjs src/infra/device-bootstrap.test.ts src/shared/device-bootstrap-profile.test.ts

What I checked:

  • Root policy read and applied: Root AGENTS.md was read fully; its compatibility/security guidance applies because the PR changes auth/session state, provider routing, setup/bootstrap scopes, fallback behavior, and dependency surfaces. (AGENTS.md:1, 033693843c26)
  • Live PR state: GitHub API shows the PR is open, author association MEMBER, labeled maintainer, mergeable clean, and currently at head d7ff4b8 with 17 changed files and +2321/-257. (d7ff4b825638)
  • Current main gateway-relay behavior: Current main has a dedicated iOS realtime relay path: gateway-relay config calls startRealtimeRelay(), creates RealtimeTalkRelaySession, and uses talk.session.* through the gateway. (apps/ios/Sources/Voice/TalkModeManager.swift:239, 033693843c26)
  • Protocol contract separates relay and client-owned realtime: Gateway protocol docs distinguish Gateway-owned talk.session.create for realtime/gateway-relay from client-owned talk.client.create for WebRTC/provider-websocket sessions. Public docs: docs/gateway/protocol.md. (docs/gateway/protocol.md:378, 033693843c26)
  • PR maps relay config into WebRTC start path: The PR marks any non-default provider or executionMode == .realtimeRelay as realtimeWebRTCEnabled, so existing gateway-relay config reaches the new client-owned WebRTC path instead of the current relay path. (apps/ios/Sources/Voice/TalkModeManager.swift:2347, d7ff4b825638)
  • Direct client-secret WebRTC path: The new iOS WebRTC session sends the gateway-issued clientSecret as a bearer Authorization header while exchanging the SDP offer directly with the provider endpoint. (apps/ios/Sources/Voice/TalkRealtimeWebRTCSession.swift:298, d7ff4b825638)
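
To make the auth boundary in the last item concrete, the request shape it describes might look roughly like the sketch below. This is not the code in TalkRealtimeWebRTCSession.swift: the function name `makeSDPOfferRequest`, the example endpoint, and the `application/sdp` content type are assumptions, and only the bearer Authorization header carrying the client secret comes from the review text.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Builds a request that posts an SDP offer with the client secret as a bearer token.
/// Hypothetical helper; the endpoint and content type are assumed, not taken from the PR.
func makeSDPOfferRequest(endpoint: URL, clientSecret: String, offerSDP: String) -> URLRequest {
    var request = URLRequest(url: endpoint)
    request.httpMethod = "POST"
    // The secret leaves the gateway boundary here: the device sends it to the provider.
    request.setValue("Bearer \(clientSecret)", forHTTPHeaderField: "Authorization")
    request.setValue("application/sdp", forHTTPHeaderField: "Content-Type")
    request.httpBody = Data(offerSDP.utf8)
    return request
}

let exampleRequest = makeSDPOfferRequest(
    endpoint: URL(string: "https://provider.example/realtime")!, // placeholder URL
    clientSecret: "example-secret",
    offerSDP: "v=0"
)
print(exampleRequest.value(forHTTPHeaderField: "Authorization") ?? "missing")
```

Because the header is set on the device, the secret's lifetime and scope become the main controls, which is why the review asks for explicit maintainer acceptance of this boundary.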

Likely related people:

  • ngutman: Authored the PR commits and also has a recent merged current-main commit on the Talk secret bootstrap handoff adjacent to the operator scope path this PR changes. (role: current feature owner and adjacent auth/bootstrap contributor; confidence: high; commits: fa4278a7d929, d7ff4b825638, c791e4242bc8; files: apps/ios/Sources/Voice/TalkModeManager.swift, apps/ios/Sources/Voice/TalkRealtimeWebRTCSession.swift, src/shared/device-bootstrap-profile.ts)
  • Shakker: The shallow current checkout attributes the baseline iOS Talk manager and relay files to the grafted baseline commit, so this is a routing hint for older current-main behavior rather than precise authorship. (role: baseline area contributor; confidence: low; commits: 0eead19feca4; files: apps/ios/Sources/Voice/TalkModeManager.swift, apps/ios/Sources/Voice/RealtimeTalkRelaySession.swift)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@ngutman ngutman marked this pull request as ready for review May 25, 2026 06:45
@clawsweeper clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 25, 2026
clawsweeper Bot (Contributor) commented May 25, 2026

ClawSweeper PR egg

🔥 Warming up: real-behavior proof passed; findings, security review, or rank-up moves are still in progress.

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.
What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 86bd10869c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/ios/Sources/Model/NodeAppModel.swift Outdated
Comment thread apps/ios/Sources/Voice/TalkRealtimeWebRTCSession.swift Outdated
@ngutman ngutman force-pushed the ios-realtime-talk-secret-scope branch from 86bd108 to 8bfe3bb Compare May 25, 2026 06:55
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. labels May 25, 2026
ngutman (Member, Author) commented May 25, 2026

@clawsweeper re-review

clawsweeper Bot (Contributor) commented May 25, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. labels May 25, 2026
@ngutman ngutman force-pushed the ios-realtime-talk-secret-scope branch from bdf1887 to 3f5aedb Compare May 25, 2026 10:26
@github-actions github-actions Bot removed the dependencies-changed PR changes dependency-related files label May 25, 2026
@openclaw-barnacle openclaw-barnacle Bot removed docs Improvements or additions to documentation channel: discord Channel integration: discord channel: matrix Channel integration: matrix channel: slack Channel integration: slack channel: telegram Channel integration: telegram channel: whatsapp-web Channel integration: whatsapp-web app: web-ui App: web-ui gateway Gateway runtime extensions: memory-core Extension: memory-core cli CLI command changes scripts Repository scripts commands Command implementations docker Docker and sandbox tooling agents Agent runtime and tooling extensions: device-pair extensions: qa-lab extensions: codex extensions: lmstudio plugin: migrate-hermes plugin: migrate-claude extensions: oc-path extensions: diffs extensions: xai labels May 25, 2026
@ngutman ngutman merged commit 9ca52ce into main May 25, 2026
42 of 45 checks passed
ngutman (Member, Author) commented May 25, 2026

Merged via squash.

Thanks @ngutman!

Labels

app: ios App: ios maintainer Maintainer-authored PR merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. P2 Normal backlog priority with limited blast radius. proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: XL status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action.
