Point it at a folder of documents for an interactive HTML briefing, or use live transcription mode to capture lectures and meetings into a structured transcript report.
Document mode — point it at a folder of docs and get a strategic HTML briefing:
```mermaid
graph LR
    A[Docs] --> B[Ingest]
    B --> C["Extract\n(2-pass + confirm)"]
    C --> D[Synthesize]
    D --> E[Render]
    E --> F[HTML Briefing]
```
Live / Transcript mode — capture a lecture or meeting and get a structured report:
```mermaid
graph LR
    A[Audio] --> B["Transcribe\n(Whisper)"]
    B --> C[Extract]
    C --> D[Synthesize Transcript]
    D --> E[Render Transcript]
    E --> F[HTML Report]
```
```bash
# 1. Install dependencies
pip install pyyaml httpx --break-system-packages

# 2. Set your API key
export ANTHROPIC_API_KEY=sk-ant-...

# 3. Edit config.yaml with your doc paths (see Config section below)

# 4. Run
python3 run.py
```

## How it works
There are two pipeline paths depending on the mode.
Document mode (--generate): ingest → extract → synthesize → render — produces a Strategic Briefing.
Ingest walks through all configured paths and reads everything it can. PDFs go through pdftotext, DOCX through pandoc, and anything that isn't binary gets read as plain text. It also supports cloning GitHub repos if you prefix the path with github:user/repo.
Extract takes the ingested documents, shards them into chunks and sends each one through a two-pass LLM extraction. The first pass pulls out items, the second pass re-reads the same document looking for anything the first pass missed. Then a separate confirmation pass runs independently, and only items that both passes agree on and that have a grounded source quote actually make it through. This is what keeps hallucinations out.
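The grounding check can be sketched as a pure function. The names below are hypothetical, not the actual `extract.py` API; the point is the rule itself: an item survives only if the independent confirmation pass also reported it *and* its source quote appears verbatim in the document.

```python
def confirm_items(extracted: list[dict], confirmed_titles: set[str],
                  document_text: str) -> list[dict]:
    """Keep only items that (a) the independent confirmation pass also
    reported and (b) carry a quote found verbatim in the source text.
    Anything else is treated as a potential hallucination and dropped."""
    kept = []
    for item in extracted:
        if item["title"] not in confirmed_titles:
            continue  # confirmation pass disagreed: drop
        quote = item.get("quote", "")
        if quote and quote in document_text:
            kept.append(item)  # grounded: the quote exists in the document
    return kept
```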
Synthesize collects all confirmed items and asks the LLM to produce a structured JSON briefing with a status summary, risk level, each item with its date and urgency and dependencies, a recommended action sequence, any blockers, and key rules or constraints from the documents. Past deadlines are assumed done unless the documents explicitly say otherwise, so it doesn't panic about things you've already handled.
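The shape of the briefing JSON might look something like this. The exact field names are illustrative assumptions based on the description above, not the tool's actual schema:

```json
{
  "status_summary": "On track; two deadlines in the next 14 days.",
  "risk_level": "medium",
  "items": [
    {
      "title": "Submit funding report",
      "date": "2025-03-01",
      "urgency": "high",
      "dependencies": ["Collect receipts"],
      "quote": "The funding report is due March 1."
    }
  ],
  "action_sequence": ["Collect receipts", "Submit funding report"],
  "blockers": [],
  "rules": ["Reports must be submitted via the portal."]
}
```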
Render takes that JSON and generates a self-contained HTML file with a vertical timeline where a TODAY line separates what's behind from what's ahead, expandable cards for each item, a game plan section and a collapsible drawer for context like rules and people. No frameworks and no build step, just open the file in a browser.
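The TODAY-separator idea can be sketched in a few lines. This is a bare-bones illustration with made-up class names, not the real `render.py`, which produces a full styled page:

```python
from datetime import date

def render_timeline(items: list[dict], today: date) -> str:
    """Sort items by ISO date and emit a TODAY divider between
    past entries and upcoming ones."""
    items = sorted(items, key=lambda i: i["date"])  # ISO dates sort lexically
    rows, today_emitted = [], False
    for item in items:
        if not today_emitted and date.fromisoformat(item["date"]) >= today:
            rows.append('<div class="today-line">TODAY</div>')
            today_emitted = True
        rows.append(f'<div class="card">{item["date"]}: {item["title"]}</div>')
    if not today_emitted:  # everything is in the past
        rows.append('<div class="today-line">TODAY</div>')
    return "<html><body>" + "\n".join(rows) + "</body></html>"
```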
Live/transcript mode (--live, --live-teams): transcribe → extract → synthesize_transcript → render_transcript — produces a Transcript Report.
Transcribe captures audio in real time via faster-whisper (microphone or system loopback) and produces an in-memory transcript. No audio is ever written to disk.
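The in-memory buffering can be sketched as below. This is a simplified stand-in (hypothetical class, plain lists instead of numpy arrays); the real `transcribe.py` wraps sounddevice callbacks and faster-whisper:

```python
class ChunkBuffer:
    """Accumulates captured audio samples in memory and yields
    fixed-size chunks for transcription. Nothing touches disk."""
    def __init__(self, sample_rate: int = 16000, chunk_seconds: int = 15):
        self.chunk_samples = sample_rate * chunk_seconds
        self.buffer: list[float] = []

    def feed(self, samples: list[float]) -> list[list[float]]:
        """Append newly captured samples; return any complete chunks
        that are now ready to hand to the transcriber."""
        self.buffer.extend(samples)
        chunks = []
        while len(self.buffer) >= self.chunk_samples:
            chunks.append(self.buffer[: self.chunk_samples])
            del self.buffer[: self.chunk_samples]  # discard consumed audio
        return chunks
```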
Synthesize Transcript takes the extracted items and asks the LLM to produce a structured JSON transcript report with a summary, topics, key claims, speakers, and takeaways — oriented around what was said rather than what needs to be done.
Render Transcript takes that JSON and generates a self-contained HTML report with a summary section, topic cards, key claims, people, takeaways, and a collapsible full transcript.
## Config

```yaml
api_key_env: ANTHROPIC_API_KEY
model: claude-sonnet-4-20250514
shard_size: 75000

topics:
  - name: My Project
    paths:
      - /path/to/docs
      - github:user/repo
```

Each topic gets its own section in the briefing. You can add as many as you want, and each one is processed independently.
| Command | What it does |
|---|---|
| `python3 run.py` | Generate briefing and serve |
| `python3 run.py --generate` | Generate only |
| `python3 run.py --serve` | Serve existing `index.html` on `localhost:8899` |
| `python3 run.py --live` | Live mic transcription → report → serve |
| `python3 run.py --live-teams` | System audio loopback (Teams/Zoom/Meet) → report → serve |
## Live Transcription — setup, platform support, and config
To use the real-time Whisper transcription mode you need a few extra dependencies.
System prerequisites:

```bash
# macOS
brew install portaudio

# Debian / Ubuntu
sudo apt install libportaudio2
```

Python dependencies:

```bash
pip install faster-whisper sounddevice numpy
```

`--live` opens the microphone, transcribes in real time using faster-whisper, and feeds the result into the transcript pipeline extract → synthesize_transcript → render_transcript. No audio files are ever written to disk.
```
[Microphone] → LiveTranscriber → [in-memory transcript]
                     |
       extract → synthesize_transcript → render_transcript → index.html
```
On first run, faster-whisper will download the large-v3 model (~3 GB). Subsequent runs reuse the cached model.
--live-teams captures whatever audio your system is currently playing instead of the microphone — so it picks up the lecturer's voice directly from Teams, Zoom, or any other app, without any manual device selection.
```
[System audio output] → loopback → LiveTranscriber → [in-memory transcript]
                              |
                extract → synthesize_transcript → render_transcript → index.html
```
| Mode | Command | Audio source |
|---|---|---|
| In-person lecture | `python3 run.py --live` | Microphone |
| Teams / Zoom / Meet | `python3 run.py --live-teams` | System audio output (loopback) |
| OS | How it works | Setup required |
|---|---|---|
| Windows | WASAPI loopback on the default output device | None — works out of the box |
| Linux | PulseAudio/PipeWire monitor source (auto-detected) | None — works out of the box |
| macOS | Scans for BlackHole or similar virtual audio cable | brew install blackhole-2ch (one time) |
macOS one-time setup:

```bash
brew install blackhole-2ch
```

- Open Audio MIDI Setup (Spotlight → "Audio MIDI Setup")
- Click + → Create Multi-Output Device → check both your speakers and BlackHole 2ch
- Set the Multi-Output Device as your system output in System Settings → Sound
- Run `python3 run.py --live-teams` — BlackHole is auto-detected, no config needed
The code scans all audio devices for names containing "blackhole", "loopback", or "virtual" on macOS, and "monitor" on Linux, so no --device flag or config entry is ever required.
Config (`config.yaml`):

```yaml
live:
  language: de               # BCP-47 language code; "de" covers Standard German and Swiss German
  initial_prompt: "Grüezi, hüt bespräche mer d'Vorlesig..."  # seeds the decoder with dialect vocabulary
  model_size: large-v3       # whisper model to use
  chunk_seconds: 15          # transcribe every N seconds of captured audio
  topic_name: "Live Lecture" # name used in the generated report
```

Swiss German dialect support: faster-whisper uses the `language: de` code for all German variants. Seeding the decoder with `initial_prompt` text written in Swiss German (e.g. "Grüezi, hüt bespräche mer...") biases the model toward dialect-specific vocabulary and spelling, improving accuracy for Swiss German speakers.
Privacy: no audio is ever saved to disk. All audio is processed in memory and discarded once the transcript is produced.
## Cost
Each topic with N document shards makes roughly 3N + 1 API calls since every shard goes through extract, re-extract and confirm, plus one synthesize call at the end. On Anthropic API Tier 1 you're limited to 8k output tokens per minute, so keep workers = 1 in extract.py to avoid getting rate limited. The retry logic in llm.py handles 429s with exponential backoff if it does happen.
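As a back-of-envelope check, the call count works out like this (a hypothetical helper, not part of the codebase):

```python
def api_calls(shards_per_topic: list[int]) -> int:
    """Rough API call count: each shard costs 3 calls (extract,
    re-extract, confirm) and each topic adds one synthesize call."""
    return sum(3 * n + 1 for n in shards_per_topic)
```

So a single topic with 10 shards makes `api_calls([10])` = 31 calls.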
| File | Purpose |
|---|---|
| `run.py` | orchestrator |
| `ingest.py` | file walker and text extraction |
| `extract.py` | two-pass LLM extraction with confirmation |
| `synthesize.py` | `synthesize()` for document briefings; `synthesize_transcript()` for live transcript reports |
| `render.py` | `render()` for document briefings; `render_transcript()` for live transcript reports |
| `llm.py` | API client with credential scrubbing and retry |
| `transcribe.py` | real-time Whisper mic transcription (`--live` mode) |
| `config.yaml` | paths and settings |
MIT