SpeakTwo

Real-time, two-way speech translation for face-to-face conversations.

Put your phone on the table between two people and let them talk in their own languages — SpeakTwo transcribes and translates both sides live, powered by OpenAI's gpt-realtime-translate.

What it is

SpeakTwo is a BYOK (Bring Your Own Key) iOS app for live, in-person interpreting. Two people pick their languages once, hit Start, and talk naturally. The app auto-detects who's speaking which language and streams the translation as they go — no turn-taking, no buttons to pass back and forth.

It runs two simultaneous realtime sessions (one per direction) over WebSocket straight to OpenAI. Your API key lives in the device Keychain; there is no developer-operated server in the loop.

Features

🎙️ Live two-way translation — both directions stream at once; the model auto-detects the source language from the audio.
🔄 Two display modes
- Face-to-face — two panels, the top one rotated 180° for the person sitting across from you.
- Side-by-side chat — a single chronological transcript showing each utterance and its translation.
🎚️ Audio tuning for the room — microphone-scenario presets (phone held close vs. on a table between two people) and optional auto-leveling to balance near and far speakers.
🗂️ Local archive — past sessions are saved on-device so you can review transcripts later.
🔐 Privacy by design — API key in the Keychain, audio sent directly to OpenAI, nothing routed through a third-party backend. See PRIVACY.md.
🩺 Built-in diagnostics — a live log of the realtime connection for debugging.
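
To make the auto-leveling feature above more concrete, here is a minimal sketch of one common approach: measure each frame's RMS level and smoothly steer a gain toward a target loudness. The AutoLeveler type, its parameter names, and its default values are all assumptions made for illustration; the app's actual audio pipeline may work differently.

```swift
import Foundation

/// Hypothetical auto-leveler: scales each audio frame toward a target RMS level.
struct AutoLeveler {
    var targetRMS: Float = 0.1   // assumed target loudness (full scale = 1.0)
    var maxGain: Float = 8.0     // cap so faint input is not boosted without limit
    var smoothing: Float = 0.2   // 0...1, how quickly the gain follows the input
    private(set) var gain: Float = 1.0

    /// Root-mean-square level of one frame of samples in -1...1.
    static func rms(_ frame: [Float]) -> Float {
        guard !frame.isEmpty else { return 0 }
        let sumOfSquares = frame.reduce(Float(0)) { $0 + $1 * $1 }
        return (sumOfSquares / Float(frame.count)).squareRoot()
    }

    /// Updates the smoothed gain from this frame and returns the leveled samples.
    mutating func process(_ frame: [Float]) -> [Float] {
        let level = Self.rms(frame)
        if level > 0.0001 {      // treat quieter frames as silence and keep the gain
            let desired = min(targetRMS / level, maxGain)
            gain += smoothing * (desired - gain)
        }
        let appliedGain = gain
        // Clamp so the boosted signal never exceeds full scale.
        return frame.map { max(-1, min(1, $0 * appliedGain)) }
    }
}
```

With a low smoothing value the gain drifts slowly, which avoids audible pumping when a near speaker and a far speaker alternate; a value of 1.0 would jump to the desired gain on every frame.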

Supported languages

Translation output supports the 13 languages offered by gpt-realtime-translate:


English	中文 (Mandarin)	Español	Português
Français	Deutsch	Italiano	日本語
한국어	Русский	हिन्दी	Bahasa Indonesia
Tiếng Việt

Requirements

iOS 26.0 or later
Xcode 26+
An OpenAI API key with access to gpt-realtime-translate

Getting started

git clone https://github.com/everettjf/SpeakTwo.git
cd SpeakTwo
open SpeakTwo.xcodeproj

Then in Xcode:

1. Select your team under Signing & Capabilities. The bundle identifier is com.xnu.speaktwo; change it to your own.
2. Build and run on a physical device; the iOS Simulator cannot capture microphone audio.
3. On first launch, open Settings and paste your OpenAI API key (stored in the Keychain), then pick the two languages.
4. Tap Start and talk.

BYOK & cost

SpeakTwo has no subscription and no backend. You bring your own OpenAI API key and pay OpenAI directly for realtime audio usage. Because translation runs as two concurrent realtime sessions, expect roughly double the per-minute audio cost of a single session. The app tracks usage locally so you can keep an eye on it.
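
The "roughly double" estimate can be worked through as simple arithmetic. The sketch below assumes cost scales linearly with audio minutes per session; the estimatedCost function and the rate used in the example are placeholders, not real pricing, so check OpenAI's current pricing before relying on any number.

```swift
import Foundation

/// Rough cost estimate for a conversation, assuming cost is linear in audio minutes.
/// `ratePerSessionMinute` is a placeholder value supplied by the caller.
func estimatedCost(minutes: Double,
                   ratePerSessionMinute: Double,
                   concurrentSessions: Int = 2) -> Double {
    minutes * ratePerSessionMinute * Double(concurrentSessions)
}

// Example with a made-up rate of 0.05 per session-minute for a 30-minute talk.
let single = estimatedCost(minutes: 30, ratePerSessionMinute: 0.05, concurrentSessions: 1)
let twoWay = estimatedCost(minutes: 30, ratePerSessionMinute: 0.05)
print("one session:", single, "two sessions:", twoWay)
```

Whatever the real rate is, the two-way figure is twice the single-session figure under this linear assumption, because both directions stream for the full length of the conversation.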

Architecture

A small, dependency-free SwiftUI app:

SpeakTwo/
├── Models/        Language, AppSettings, ChatSession, KeychainStore, TranslationError
├── Views/         Home, Chat, Transcript, Settings, Archive, Onboarding, Diagnostics
└── Services/
    ├── RealtimeTranslator      One WebSocket session → one target language
    ├── TranslationCoordinator  Orchestrates both directions + state
    ├── AudioCaptureService     Microphone capture & framing
    ├── SessionStore            Local persistence of transcripts
    ├── UsageTracker            Local usage accounting
    └── DiagnosticsLogger       Connection log for debugging

Each RealtimeTranslator owns a single connection to gpt-realtime-translate configured for one output language; the TranslationCoordinator runs two of them and merges their transcript deltas into the UI. Transient socket failures are auto-reconnected with exponential backoff.
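
As an illustration of how deltas from two directions could be merged into one chronological transcript, here is a minimal sketch. The Direction, TranscriptDelta, TranscriptEntry, and TranscriptMerger types are hypothetical and are not taken from the app's source; the real coordinator may key and order utterances differently.

```swift
import Foundation

/// Which of the two translation directions produced a delta (hypothetical type).
enum Direction { case aToB, bToA }

/// One incremental piece of text for an utterance (hypothetical shape).
struct TranscriptDelta {
    let direction: Direction
    let utteranceID: String
    let startedAt: TimeInterval   // when the utterance began, in seconds
    let text: String              // text fragment to append
}

/// One row in the merged, chronological transcript.
struct TranscriptEntry: Equatable {
    let direction: Direction
    let utteranceID: String
    let startedAt: TimeInterval
    var text: String
}

struct TranscriptMerger {
    private(set) var entries: [TranscriptEntry] = []

    /// Appends the delta to its utterance, creating the entry on first sight
    /// and keeping entries ordered by utterance start time.
    mutating func apply(_ delta: TranscriptDelta) {
        if let index = entries.firstIndex(where: {
            $0.utteranceID == delta.utteranceID && $0.direction == delta.direction
        }) {
            entries[index].text += delta.text
            return
        }
        let entry = TranscriptEntry(direction: delta.direction,
                                    utteranceID: delta.utteranceID,
                                    startedAt: delta.startedAt,
                                    text: delta.text)
        // Insert before the first entry that started later (stable for ties).
        let insertAt = entries.firstIndex(where: { $0.startedAt > entry.startedAt })
            ?? entries.endIndex
        entries.insert(entry, at: insertAt)
    }
}
```

Ordering by utterance start time rather than by delta arrival keeps the transcript stable when one direction's stream lags behind the other.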

Testing

Unit tests cover error classification and reconnect backoff:

xcodebuild test \
  -project SpeakTwo.xcodeproj \
  -scheme SpeakTwo \
  -destination 'platform=iOS Simulator,name=iPhone 16'
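
For readers unfamiliar with the two behaviors those tests target, the sketch below shows one plausible shape for them: classify a failure as transient or not, then compute a capped exponential delay for transient ones. ConnectionFailure, backoffDelay, retryDelay, and every default value here are assumptions for illustration; the app's TranslationError and its reconnect policy may differ.

```swift
import Foundation

/// Hypothetical failure categories; the app's real TranslationError may differ.
enum ConnectionFailure {
    case networkLost
    case timedOut
    case serverClosed(code: Int)
    case invalidAPIKey
    case quotaExceeded

    /// Transient failures are worth retrying; the rest need user action.
    var isTransient: Bool {
        switch self {
        case .networkLost, .timedOut:
            return true
        case .serverClosed(let code):
            return code != 1008   // 1008 is the WebSocket "policy violation" close code
        case .invalidAPIKey, .quotaExceeded:
            return false
        }
    }
}

/// Delay before reconnect attempt `attempt` (0-based): base * 2^attempt, capped.
func backoffDelay(attempt: Int,
                  base: TimeInterval = 0.5,
                  cap: TimeInterval = 30) -> TimeInterval {
    let exponent = Double(max(0, min(attempt, 16)))   // bound the exponent
    return min(cap, base * pow(2, exponent))
}

/// Returns the delay to wait, or nil when the failure should not be retried.
func retryDelay(after failure: ConnectionFailure,
                attempt: Int,
                maxAttempts: Int = 6) -> TimeInterval? {
    guard failure.isTransient, attempt < maxAttempts else { return nil }
    return backoffDelay(attempt: attempt)
}
```

Keeping both pieces as pure functions makes them easy to unit test without opening a socket. Real reconnect logic often adds random jitter to the delay so that many clients do not retry in lockstep; that is left out here for clarity.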

Deployment

deploy.sh bumps the build number, archives, and uploads to TestFlight:

export APPLE_ID="you@example.com"
export APP_SPECIFIC_PASSWORD="xxxx-xxxx-xxxx-xxxx"   # appleid.apple.com → App-Specific Passwords
./deploy.sh

Privacy

SpeakTwo collects no personal data through any developer-operated server. Your API key stays in the Keychain, audio goes directly to OpenAI, and transcripts are saved only on your device. Full policy: PRIVACY.md.

Contributing

Issues and pull requests are welcome. For bugs or feature ideas, please open an issue.

License

Released under the MIT License.
