AutoPresenter is a macOS prototype for AI-assisted presentation rehearsal. It listens while you speak, keeps the model grounded on the current slide, and interprets structured Realtime commands for highlighting and navigation.
The project is intentionally still a prototype: there is no production-grade Keynote/PowerPoint actuation flow yet.
- Realtime speech session via WebRTC (`gpt-realtime` by default)
- JSON deck loading with multiple slide layouts
- Presenter window + main control window
- In-app slide editor with drag reorder, add/delete, and save
- Local command safety gate before any command is accepted
- Activity and command logging (including mirrored runtime log file)
- macOS 14+
- Xcode 26+
- XcodeGen
- OpenAI API key with Realtime access
- Install XcodeGen:

  ```bash
  brew install xcodegen
  ```

- Set API key:

  ```bash
  export OPENAI_API_KEY='sk-...'
  ```

  or create `~/.api-keys`:

  ```json
  {
    "OPENAI_API_KEY": "sk-..."
  }
  ```

- Generate project and open it:

  ```bash
  xcodegen generate
  open AutoPresenter.xcodeproj
  ```

- Build and run target `AutoPresenter`.
- In the app:
  - Open a deck (`File > Open…`)
  - Start recording (`File > Start Recording` or `Cmd+R`)
  - Open presentation window (`File > Show Presentation` or `Cmd+P`)
  - Speak through your talk and monitor decisions in the activity feed
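The two ways of supplying the API key can be pictured as a lookup with a fallback. The sketch below is language-neutral Python, not the app's own code: the function name `resolve_api_key` is hypothetical, and the precedence (environment variable first, then the JSON key file) is an assumption rather than documented behavior.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str], key_file: Path) -> Optional[str]:
    """Return the API key from the environment, else from a JSON key file."""
    # Assumption: the environment variable takes precedence over the file.
    from_env = env.get("OPENAI_API_KEY", "").strip()
    if from_env:
        return from_env
    try:
        data = json.loads(key_file.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None  # file is missing, unreadable, or not valid JSON
    if not isinstance(data, dict):
        return None
    from_file = data.get("OPENAI_API_KEY")
    if isinstance(from_file, str) and from_file.strip():
        return from_file.strip()
    return None


if __name__ == "__main__":
    key = resolve_api_key(os.environ, Path.home() / ".api-keys")
    print("key found" if key else "no key configured")
```

Passing the environment and the file path as arguments keeps the lookup easy to exercise without touching real credentials.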
The loader accepts both a simple prototype schema and a richer schema with layouts. Unknown/extra fields are tolerated.
Simple prototype schema:

```json
{
  "presentation_title": "My Talk",
  "language": "en",
  "slides": [
    {
      "index": 1,
      "title": "Intro",
      "bullets": ["Point A", "Point B"]
    }
  ]
}
```

Richer schema with layouts:

```json
{
  "deckTitle": ["My Talk"],
  "slides": [
    {
      "layout": "bullets",
      "title": ["Slide title"],
      "subtitle": ["Optional subtitle"],
      "bullets": ["One", "Two"]
    }
  ]
}
```

Supported layouts:

- `title`
- `bullets`
- `quote`
- `image`
- `twoColumn`
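One way to accept both schemas is to normalize them into a single in-memory shape and ignore every key the loader does not know. The Python sketch below illustrates that idea only; it is not the app's loader. The names `Deck`, `Slide`, `as_text`, and `load_deck` are hypothetical, and three behaviors are assumptions: array-valued titles are joined with newlines, a missing layout defaults to `bullets`, and an unrecognized layout falls back to `bullets`.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List

SUPPORTED_LAYOUTS = {"title", "bullets", "quote", "image", "twoColumn"}


@dataclass
class Slide:
    title: str
    layout: str = "bullets"
    subtitle: str = ""
    bullets: List[str] = field(default_factory=list)


@dataclass
class Deck:
    title: str
    slides: List[Slide]


def as_text(value: Any) -> str:
    """Accept a plain string or a list of lines; anything else becomes ''."""
    if isinstance(value, str):
        return value
    if isinstance(value, list):
        return "\n".join(str(line) for line in value)  # assumption: join lines
    return ""


def load_deck(raw: str) -> Deck:
    data = json.loads(raw)
    # The simple schema uses "presentation_title", the richer one "deckTitle".
    title = as_text(data.get("deckTitle", data.get("presentation_title")))
    slides = []
    for entry in data.get("slides", []):
        layout = entry.get("layout", "bullets")
        if layout not in SUPPORTED_LAYOUTS:
            layout = "bullets"  # assumption: unknown layouts fall back
        slides.append(Slide(
            title=as_text(entry.get("title")),
            layout=layout,
            subtitle=as_text(entry.get("subtitle")),
            bullets=[str(item) for item in entry.get("bullets", [])],
        ))
    # Keys not read above (e.g. "language", "index") are simply ignored.
    return Deck(title=title, slides=slides)
```

Reading only the keys it needs is what makes this kind of loader tolerant of unknown or extra fields.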
Expected model tool payload:
```json
{
  "commands": [
    {
      "action": "mark | next | previous | goto | stay",
      "target_slide": null,
      "mark_index": 2,
      "confidence": 0.84,
      "rationale": "brief factual reason",
      "utterance_excerpt": "optional excerpt",
      "highlight_phrases": ["exact phrase from slide"]
    }
  ]
}
```

Safety gate validates:
- command shape / compatibility
- confidence threshold
- cooldown and dwell windows
- navigation target validity
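The checks listed above can be sketched as one function that either accepts a command or rejects it with a reason. This is an illustrative Python sketch, not the app's gate. `GateState` and `check_command` are hypothetical names, the threshold and window values are placeholders, and several semantics are assumptions: slides are numbered from 1, the cooldown applies to every command except `stay`, and the dwell window applies only to navigation.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple

ACTIONS = {"mark", "next", "previous", "goto", "stay"}


@dataclass
class GateState:
    current_slide: int                       # assumption: 1-based
    slide_count: int
    slide_shown_at: float = 0.0              # when the current slide appeared
    last_accepted_at: float = float("-inf")  # when a command last passed


def check_command(cmd: Dict[str, Any], state: GateState, now: float,
                  min_confidence: float = 0.7, cooldown_s: float = 2.0,
                  dwell_s: float = 3.0) -> Tuple[bool, str]:
    """Return (accepted, reason). All default limits are illustrative."""
    action = cmd.get("action")
    if action not in ACTIONS:
        return False, "unknown action"
    confidence = cmd.get("confidence")
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return False, "confidence missing"
    if confidence < min_confidence:
        return False, "confidence below threshold"
    if action == "stay":
        return True, "ok"
    if now - state.last_accepted_at < cooldown_s:
        return False, "cooldown active"
    if action == "mark":
        return True, "ok"
    # Everything below is navigation: next, previous, or goto.
    if now - state.slide_shown_at < dwell_s:
        return False, "dwell window active"
    if action == "next":
        target = state.current_slide + 1
    elif action == "previous":
        target = state.current_slide - 1
    else:
        target = cmd.get("target_slide")
        if isinstance(target, bool) or not isinstance(target, int):
            return False, "goto needs an integer target_slide"
    if not 1 <= target <= state.slide_count:
        return False, "target out of range"
    return True, "ok"
```

Returning a reason alongside the verdict makes rejected commands easy to surface in an activity feed or log.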
Keyboard shortcuts:

- `Cmd+O`: Open deck
- `Cmd+S`: Save
- `Cmd+Shift+S`: Save As…
- `Cmd+E`: Open editor
- `Cmd+P`: Show/Hide presentation window
- `Cmd+R`: Start/Stop recording
- `Cmd+F`: Toggle fullscreen
- `Cmd+Return`: Save slide draft in editor
- Mirrored command log file: `runtime/command-log.txt`
- File is truncated on app startup, then appended for the current session
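The truncate-then-append behavior can be sketched as a small wrapper that empties the file once when it is created and only appends afterwards. This is an illustrative Python sketch with a hypothetical `CommandLog` class; the app's actual logging code and line format are not shown here.

```python
from pathlib import Path


class CommandLog:
    """Mirror log lines to a file that starts empty for each session."""

    def __init__(self, path: Path) -> None:
        path.parent.mkdir(parents=True, exist_ok=True)
        # Writing an empty string truncates any log left from a previous run.
        path.write_text("", encoding="utf-8")
        self._path = path

    def append(self, line: str) -> None:
        # Append mode adds to the end without touching earlier lines.
        with self._path.open("a", encoding="utf-8") as handle:
            handle.write(line.rstrip("\n") + "\n")


if __name__ == "__main__":
    log = CommandLog(Path("runtime") / "command-log.txt")
    log.append("next accepted")
    log.append("goto rejected: target out of range")
```

Because the file is emptied on startup, it only ever holds the current session, so anything worth keeping has to be copied out before the next launch.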
Current focus:
- Realtime command pipeline quality
- window/editor UX
- robust in-memory deck editing and persistence
Not in scope yet:
- full production presenter actuation
- integrated finalized fullscreen presenter product flow
MIT. See LICENSE.