A personal macOS app that transcribes MacBook microphone audio in real time at conference venues using Apple's built-in macOS APIs, and displays the result as translated subtitles. The recognition language and target language can be freely chosen from the languages supported by the OS (default: English → Japanese).
- Transcription: Speech.framework (SpeechAnalyzer / SpeechTranscriber on macOS 26, on-device)
- Translation: Translation.framework (TranslationSession, on-device)
- UI: SwiftUI two-pane layout (source transcript / translated text)
📖 For detailed usage (registering technical terms, on-site tips, troubleshooting), see docs/usage.md.
- macOS 26.0 or later / Apple Silicon
- Xcode 26 or later (for building)
- First launch only: a network connection is required to download the speech recognition model and the translation model
# Build
xcodebuild -project ConfLingo.xcodeproj -scheme ConfLingo -configuration Debug build
# Launch (open the .app generated under DerivedData)
open ~/Library/Developer/Xcode/DerivedData/ConfLingo-*/Build/Products/Debug/ConfLingo.app

Run tests:

xcodebuild test -project ConfLingo.xcodeproj -scheme ConfLingo -destination 'platform=macOS'

- Microphone: A microphone permission dialog appears the first time you press Start. Transcription does not work without it.
- Speech recognition model: If the recognition model is not installed at first launch, the download starts automatically (with progress display)
- Translation model: If the translation model is not installed, the standard OS download confirmation dialog appears
To reset the microphone permission:
tccutil reset Microphone com.gavrri.conflingo

If you accidentally denied the permission, enable ConfLingo under System Settings > Privacy & Security > Microphone.
- Launch the app (on first launch, model checks and downloads run)
- Choose the recognition language and target language with the language pickers (changeable only while stopped; changing them automatically triggers an availability check and model download)
- Enter a session name if needed
- In the technical terms field, enter event-specific terms (speaker names, product names, technical jargon) separated by commas. They are registered as contextual strings for speech recognition at Start, improving recognition accuracy for proper nouns (preset with terms for Code with Claude Tokyo by default; changes take effect from the next Start)
- Press Start (⌘R) to begin transcription
- Recognition pane: in-progress (partial) sentences are shown dimmed and italic, then appended to the history once finalized
- Translation pane: only finalized source sentences are translated, appended per finalized sentence
- Press Stop (⌘R) to stop. Pressing Start again appends to the existing history
- Save Markdown saves the entire session as Markdown
- A− / A+ (⌘− / ⌘+) adjusts the font size; the "always on top" checkbox keeps the window in front
- Clear discards the history (only while stopped)
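
The technical terms field described in the steps above takes a comma-separated list. The following is a minimal sketch of how such input might be normalized before it is registered as contextual strings. `parseKeywords` is a hypothetical helper, not the actual API of Models/KeywordParser.swift, and the trimming and case-insensitive deduplication rules are assumptions.

```swift
import Foundation

/// Hypothetical helper (not the project's actual KeywordParser API):
/// splits a comma-separated field into trimmed, non-empty, unique terms.
func parseKeywords(_ field: String) -> [String] {
    var seen = Set<String>()
    var terms: [String] = []
    // split(separator:) drops empty pieces such as the one in "a,,b".
    for raw in field.split(separator: ",") {
        let term = raw.trimmingCharacters(in: .whitespacesAndNewlines)
        guard !term.isEmpty else { continue }
        // Assumption: deduplicate case-insensitively, keeping the first spelling.
        if seen.insert(term.lowercased()).inserted {
            terms.append(term)
        }
    }
    return terms
}

print(parseKeywords("Claude Code, MCP , , claude code,SpeechAnalyzer"))
// prints ["Claude Code", "MCP", "SpeechAnalyzer"]
```

Keeping the first spelling preserves the capitalization the user typed, which matters for proper nouns.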
Share the repository URL and have the recipient run the following. No Gatekeeper warning appears.
git clone <repository URL> && cd conflingo
xcodebuild -project ConfLingo.xcodeproj -scheme ConfLingo build
open ~/Library/Developer/Xcode/DerivedData/ConfLingo-*/Build/Products/Debug/ConfLingo.app

# 1. Release build (fix the output path to build/)
xcodebuild -project ConfLingo.xcodeproj -scheme ConfLingo \
-configuration Release -derivedDataPath build build
# 2. Zip with ditto (zip -r can break signatures and extended attributes)
ditto -c -k --sequesterRsrc --keepParent \
build/Build/Products/Release/ConfLingo.app dist/ConfLingo-1.0.zip

Send the resulting dist/ConfLingo-1.0.zip via AirDrop. Because the app is ad-hoc signed (not notarized), the recipient must bypass Gatekeeper on first launch:
- Unzip and double-click → "cannot be opened because the developer cannot be verified"
- System Settings > Privacy & Security > "Open Anyway"
- After that, it launches normally (developers can also run xattr -dr com.apple.quarantine ConfLingo.app)
- macOS 26 or later + Apple Silicon (does not launch on earlier macOS versions)
- Network required on first launch: each Mac downloads the recognition and translation models (several hundred MB). In case the venue Wi-Fi is weak, ask recipients to launch the app as soon as they receive it
- Microphone permission dialog on first Start → "Allow"
- Venue audio is assumed to be captured by the MacBook's built-in microphone. Internal Mac audio (system audio) such as Zoom / YouTube cannot be captured
- In-progress (partial) sentences are not translated by design (to avoid unstable translations). Translation lags finalized sentences by roughly 2–5 seconds
- Languages can only be changed while stopped. Switching languages keeps the existing subtitle history (the Markdown header records the language pair at save time)
- Speaker diarization, summarization, and audio recording are not supported
- No Developer ID signing or notarization for distribution; builds are ad-hoc signed only (intended for personal use with local builds)
- Recognition accuracy is heavily affected by microphone position and ambient noise. Point the MacBook toward the speakers and sit near the front if possible
AVAudioEngine microphone input (hardware format)
  └ AVAudioConverter converts to SpeechAnalyzer's preferred format
      └ AsyncStream<AnalyzerInput> → SpeechAnalyzer / SpeechTranscriber (volatileResults)
          ├ partial → SessionStore.volatileText (shown dimmed in the recognition pane)
          └ final → finalized into SessionStore.segments → TranslationCoordinator queue
              └ TranslationSession inside the .translationTask closure translates sequentially
                  └ SessionStore.applyTranslation → shown in the translation pane
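
The partial/final split in the flow above can be illustrated with a simplified store. This is a sketch under stated assumptions, not the project's SessionStore: only the names `volatileText`, `segments`, and `applyTranslation` come from the diagram, while `Segment`, `SessionStoreSketch`, `updatePartial`, and `finalize` are hypothetical.

```swift
import Foundation

/// One finalized sentence; translation stays nil until it arrives.
struct Segment: Identifiable {
    let id: UUID
    let source: String
    var translation: String?
}

/// Simplified stand-in for SessionStore (hypothetical shape).
struct SessionStoreSketch {
    private(set) var volatileText = ""
    private(set) var segments: [Segment] = []

    /// Partial results only replace the dimmed in-progress line.
    mutating func updatePartial(_ text: String) {
        volatileText = text
    }

    /// A final result clears the partial line, joins the history,
    /// and is returned so the caller can queue it for translation.
    @discardableResult
    mutating func finalize(_ text: String) -> Segment? {
        volatileText = ""
        let trimmed = text.trimmingCharacters(in: .whitespacesAndNewlines)
        guard !trimmed.isEmpty else { return nil }
        let segment = Segment(id: UUID(), source: trimmed, translation: nil)
        segments.append(segment)
        return segment
    }

    /// Attaches a translation to the segment with the matching ID, if any.
    mutating func applyTranslation(_ translated: String, for id: UUID) {
        guard let index = segments.firstIndex(where: { $0.id == id }) else { return }
        segments[index].translation = translated
    }
}

var store = SessionStoreSketch()
store.updatePartial("Welcome to the")
if let segment = store.finalize("Welcome to the keynote.") {
    store.applyTranslation("基調講演へようこそ。", for: segment.id)
}
```

Because only `finalize` returns a segment, partial text never reaches the translation queue, which matches the design note that in-progress sentences are not translated.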
| File | Responsibility |
|---|---|
| Models/SessionStore.swift | Single source of truth for the UI. Segment history, partials, dedup |
| Models/KeywordParser.swift | Parses the technical terms field + event presets |
| Models/LanguageCatalog.swift | Language display names and target candidate filtering |
| Services/AudioCaptureService.swift | Microphone input, format conversion, permission requests |
| Services/SpeechTranscriptionService.swift | SpeechAnalyzer / SpeechTranscriber wiring |
| Services/TranslationCoordinator.swift | Translation queue (ID dedup + AsyncStream) |
| Services/ModelAvailabilityService.swift | Availability checks and model downloads at launch |
| Export/MarkdownExporter.swift | Markdown generation (pure function) |
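
The "ID dedup + AsyncStream" responsibility listed for the translation queue could look roughly like the sketch below. `TranslationQueueSketch`, `Request`, and `enqueue` are hypothetical names chosen for illustration, not the project's TranslationCoordinator API, and the sketch assumes Swift 5.9 or later for `AsyncStream.makeStream`.

```swift
import Foundation

/// Hypothetical dedup queue: each segment ID is forwarded at most once.
final class TranslationQueueSketch {
    struct Request {
        let id: UUID
        let text: String
    }

    /// Consumers read pending requests from this stream in FIFO order.
    let requests: AsyncStream<Request>
    private let continuation: AsyncStream<Request>.Continuation
    private var enqueuedIDs = Set<UUID>()

    init() {
        let (stream, continuation) = AsyncStream.makeStream(of: Request.self)
        self.requests = stream
        self.continuation = continuation
    }

    /// Returns false (and yields nothing) when the ID was already queued.
    @discardableResult
    func enqueue(id: UUID, text: String) -> Bool {
        guard enqueuedIDs.insert(id).inserted else { return false }
        continuation.yield(Request(id: id, text: text))
        return true
    }

    /// Ends the stream so a consuming `for await` loop can exit.
    func finish() {
        continuation.finish()
    }
}

let queue = TranslationQueueSketch()
let segmentID = UUID()
let firstAccepted = queue.enqueue(id: segmentID, text: "Hello")
let duplicateAccepted = queue.enqueue(id: segmentID, text: "Hello")
queue.finish()
```

A consumer would iterate with `for await request in queue.requests` and translate one request at a time, which keeps translations in the order the sentences were finalized.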