ReadAloud (Kokoro TTS + CTC alignment on Runpod Serverless)

Personal reader that converts PDFs/text into synchronized audio with word-level highlighting.

Structure

frontend/               # Next.js app (App Router)
src/
  serverless/
    handler/            # Runpod handler (single entry)
      main.py
      models/           # baked model assets (in image)
      utils/
  shared/
    contracts/          # shared I/O contracts (TS + Python)
    utils/
docs/
  architecture.md

Core endpoints (via Runpod handler)

health
prepare_document → returns cleaned paragraphs
synthesize_chunk → returns audio + word timings (via Wav2Vec2 + ctc-segmentation)
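
A minimal sketch of how a single-entry handler might route these three operations. The `op` field name, the payload shapes, and the helper bodies are assumptions for illustration, not the repository's actual contract. The only Runpod convention relied on is that the request payload arrives under `job["input"]`. The cleaning logic in `prepare_document` is a stand-in, and `synthesize_chunk` is left as a stub because the TTS and alignment steps depend on the baked models.

```python
import re
from typing import Any, Callable, Dict


def health(_payload: Dict[str, Any]) -> Dict[str, Any]:
    # Cheap liveness check; does not touch any model.
    return {"status": "ok"}


def prepare_document(payload: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical cleaning pass: the real pipeline may do more (or less).
    text = payload.get("text", "")
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    paragraphs = []
    # Treat one or more blank lines as a paragraph boundary.
    for block in re.split(r"\n\s*\n", text):
        cleaned = " ".join(block.split())  # collapse internal whitespace
        if cleaned:
            paragraphs.append(cleaned)
    return {"paragraphs": paragraphs}


def synthesize_chunk(_payload: Dict[str, Any]) -> Dict[str, Any]:
    # Placeholder: TTS and CTC alignment are out of scope for this sketch.
    raise NotImplementedError("synthesize_chunk is not implemented in this sketch")


# Hypothetical operation table; keys mirror the endpoint names listed above.
OPERATIONS: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    "health": health,
    "prepare_document": prepare_document,
    "synthesize_chunk": synthesize_chunk,
}


def handler(job: Dict[str, Any]) -> Dict[str, Any]:
    payload = job.get("input") or {}
    op = payload.get("op")  # "op" is an assumed field name
    fn = OPERATIONS.get(op)
    if fn is None:
        return {"error": f"unknown op: {op!r}"}
    try:
        return fn(payload)
    except NotImplementedError as exc:
        return {"error": str(exc)}
```

With the Runpod Python SDK, a function like this would typically be registered with `runpod.serverless.start({"handler": handler})`. Calling `handler({"input": {"op": "health"}})` directly is enough to exercise the routing locally.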

Notes

Models are baked into the Docker image to avoid cold-start downloads.
No server-side storage; results are returned to the client.
UI: minimalist black/white; no emojis in logs.
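
Because nothing is stored server-side, the audio has to travel back inside the response itself. The sketch below shows one way that could look: 16-bit mono PCM is wrapped as WAV, base64-encoded, and returned next to the word timings in a JSON-serializable dict. The names `WordTiming`, `build_response`, `audio_b64`, and `words` are hypothetical and not taken from the shared contracts. The 16-bit mono format is also an assumption.

```python
import base64
import io
import wave
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass
class WordTiming:
    # Hypothetical shape: one aligned word with start/end in seconds.
    word: str
    start: float
    end: float


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int) -> bytes:
    # Wrap raw 16-bit mono PCM in a WAV container, entirely in memory.
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return buf.getvalue()


def build_response(pcm: bytes, sample_rate: int,
                   timings: List[WordTiming]) -> Dict[str, Any]:
    # Everything the client needs is inlined; no file or URL is created.
    wav = pcm_to_wav_bytes(pcm, sample_rate)
    return {
        "audio_b64": base64.b64encode(wav).decode("ascii"),
        "sample_rate": sample_rate,
        "words": [asdict(t) for t in timings],
    }
```

Base64 inflates the payload by roughly a third, so this approach fits best when chunks are kept short enough to stay within the platform's response size limits.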


calledforth/readaloud
