Telepathy is silent dictation, AKA mouthreading: speech-to-text without audio, from lip movement alone. Allegedly what Q.ai, the AI company Apple bought for $2B, is solving.
A Tauri desktop app for macOS that works as a hold-to-record hotkey overlay: while the hotkey is held, webcam video is captured; on release, the clip is sent to a local Python server for visual speech (lipreading) inference and pasted as text.
- Install JS dependencies:

  ```sh
  npm install
  ```
- Create the single Python environment (uv-managed):

  ```sh
  ./scripts/setup_python_env.sh
  ```

  On macOS 26+, this script upgrades torch/torchvision/torchaudio from the PyTorch nightly index and then prints MPS availability, to handle newer macOS runtime compatibility before stable wheels catch up.
- Download/prepare AutoAVSR assets (repo + checkpoints):

  ```sh
  ./scripts/setup_autoavsr_assets.sh
  ```
- Run the Python server:

  ```sh
  .venv/bin/uvicorn server.main:app --reload
  ```
- Run the app bundle in debug mode (recommended for macOS permissions onboarding):

  ```sh
  npm run tauri:dev-app
  ```

  This builds `src-tauri/target/debug/bundle/macos/Telepathy.app` and opens it. If you want pure hot-reload webview iteration, you can still run:

  ```sh
  npm run tauri dev
  ```
On startup, Telepathy opens a small onboarding UI to verify macOS permissions:
- Camera (required to capture video)
- Accessibility (required to auto-paste text via Cmd+V)
After onboarding, use the global hotkey `CommandOrControl+Shift+Space`: press and hold to record, release to transcribe and paste.
A small overlay pill appears while recording/processing and auto-hides after completion.
The capture pipeline targets 25 FPS and the backend normalizes incoming clips to the model FPS.
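For intuition, one minimal way to normalize clip FPS is nearest-index frame resampling (duplicating or dropping frames). This is a hedged sketch; the backend's actual resampling strategy may differ:

```python
import numpy as np

def resample_frames(frames: np.ndarray, src_fps: float, dst_fps: float) -> np.ndarray:
    """Resample a (T, H, W[, C]) clip from src_fps to dst_fps by
    nearest-frame index mapping (duplicates or drops frames)."""
    if src_fps == dst_fps or len(frames) == 0:
        return frames
    duration = len(frames) / src_fps           # clip length in seconds
    n_out = max(1, round(duration * dst_fps))  # frame count at the target rate
    idx = np.minimum(
        np.floor(np.arange(n_out) * src_fps / dst_fps).astype(int),
        len(frames) - 1,
    )
    return frames[idx]
```

For example, a 50-frame clip captured at 50 FPS maps to 25 frames at the model's 25 FPS by keeping every second frame.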
The backend now supports an AutoAVSR transcriber adapter in `server/model.py`. It expects a local checkout of:

https://github.com/mpc001/Visual_Speech_Recognition_for_Multiple_Languages

`server/model.py` auto-detects a checkout at `third_party/Visual_Speech_Recognition_for_Multiple_Languages`.
No extra environment variables are required for the default local setup.
Optional overrides:

```sh
export TELEPATHY_MODEL_BACKEND=autoavsr
export TELEPATHY_AUTOAVSR_REPO=/absolute/path/to/Visual_Speech_Recognition_for_Multiple_Languages
# Optional (defaults shown):
# export TELEPATHY_AUTOAVSR_CONFIG=configs/LRS3_V_WER19.1.ini
# export TELEPATHY_AUTOAVSR_DETECTOR=mediapipe
# export TELEPATHY_AUTOAVSR_DEVICE=mps   # default: mps; options: auto|mps|cpu|cuda:0
# export TELEPATHY_AUTOAVSR_GPU_IDX=-1
```

The server now loads AutoAVSR weights during startup and keeps the model warm in-process for subsequent requests.
The app sends an encoded `videoDataUrl` clip to `/transcribe`, and polls `/health` to wait until the model is ready.
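A minimal client sketch of that flow, assuming the default uvicorn address and response shapes like `{"ready": true}` from `/health` and `{"text": ...}` from `/transcribe` (the address and both JSON shapes are assumptions, not the app's exact contract):

```python
import base64
import json
import time
import urllib.request

SERVER = "http://127.0.0.1:8000"  # default uvicorn bind address (assumption)

def to_video_data_url(clip: bytes, mime: str = "video/webm") -> str:
    """Encode a recorded clip as the data URL posted to /transcribe."""
    return f"data:{mime};base64,{base64.b64encode(clip).decode('ascii')}"

def wait_until_ready(timeout_s: float = 60.0) -> None:
    """Poll /health until the model reports ready."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{SERVER}/health", timeout=2) as resp:
                if json.load(resp).get("ready"):
                    return
        except OSError:
            pass  # server not up yet; keep polling
        time.sleep(0.5)
    raise TimeoutError("model never became ready")

def transcribe(clip: bytes) -> str:
    body = json.dumps({"videoDataUrl": to_video_data_url(clip)}).encode()
    req = urllib.request.Request(
        f"{SERVER}/transcribe", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("text", "")
```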
- LipNet (GRID‑style sentence lipreading).
- AV‑HuBERT / fairseq (supports video‑only or AV fine‑tuning).
- Visual Speech Recognition for Multiple Languages (LRS2/LRS3/GRID models).
- deep_avsr (LRS2/LRS3 video‑only baselines).
- This starter sends short encoded video clips (`videoDataUrl`) to the backend.
- For production, enable a CSP instead of leaving it disabled in `src-tauri/tauri.conf.json`.
- If `navigator.mediaDevices.getUserMedia` is missing, rebuild/restart the Tauri app and allow camera access when prompted. On macOS, verify Camera permission in System Settings -> Privacy & Security -> Camera.
- If paste automation fails, verify Accessibility permission in System Settings -> Privacy & Security -> Accessibility.