Amara

Turn documents into audiobooks. Upload a PDF, DOCX, TXT, or Markdown file and get back a high-quality MP3 — powered by your choice of TTS provider.

Features

Multiple file formats — PDF (with OCR fallback for scanned docs), DOCX, TXT, Markdown
4 TTS providers — Edge TTS (free), Google TTS (free), OpenAI TTS, ElevenLabs
Script formatting — cleans up extraction noise and optionally rewrites text for natural listening via OpenAI
Async processing — upload and poll; no blocking the UI during long conversions
Conversion history — play, download, or delete past conversions
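
The local (non-OpenAI) cleanup step mentioned under script formatting can be pictured roughly as follows. This is a minimal sketch, assuming "extraction noise" means words hyphenated across line breaks, lines holding only a page number, and hard-wrapped lines. The function name `clean_extracted_text` and the specific rules are hypothetical and are not taken from Amara's formatter.

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Hypothetical cleanup pass; not Amara's actual formatter."""
    text = raw.replace("\r\n", "\n")
    # Re-join words hyphenated across a line break: "conver-\nsion" -> "conversion".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Drop lines that contain only a page number.
    text = re.sub(r"(?m)^[ \t]*\d+[ \t]*$\n?", "", text)
    # Blank lines separate paragraphs; single newlines inside a paragraph are unwrapped.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)
```

Note that the hyphen rule is deliberately naive: it would also merge a genuinely hyphenated compound that happens to break at a line end, which is one reason an LLM rewrite can produce more natural narration.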

Prerequisites

Requirement	Notes
Python 3.11+
Node.js 18+
PostgreSQL	Any recent version
ffmpeg + ffprobe	Must be on PATH — used for audio stitching and duration
Poppler	Optional — only needed for scanned/image-based PDFs
Tesseract OCR	Optional — only needed for scanned/image-based PDFs
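
Before the first run it can help to confirm that the command-line tools are actually on PATH. The sketch below uses only the standard library. The helper name `check_prerequisites` is hypothetical, and it assumes the Poppler dependency is reached through the `pdftoppm` binary, which the table above does not state.

```python
import shutil

REQUIRED_TOOLS = ("ffmpeg", "ffprobe")
# Assumption: Poppler is used via pdftoppm; tesseract is the Tesseract OCR binary.
OPTIONAL_TOOLS = ("pdftoppm", "tesseract")


def check_prerequisites(which=shutil.which):
    """Return (missing_required, missing_optional) as lists of tool names."""
    missing_required = [t for t in REQUIRED_TOOLS if which(t) is None]
    missing_optional = [t for t in OPTIONAL_TOOLS if which(t) is None]
    return missing_required, missing_optional
```

Calling `check_prerequisites()` with no arguments inspects the real PATH. A non-empty first list means audio stitching will not work, while a non-empty second list only matters for scanned or image-based PDFs.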

Quick Start

1. Clone and set up the backend

git clone https://github.com/Otitodev/amara.git
cd amara/backend

python -m venv venv
# Windows:
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate

pip install -r requirements.txt

2. Create the database

createdb amara

3. Configure environment

cp .env.example .env
# Edit .env — at minimum set DATABASE_URL

4. Start the backend

uvicorn main:app --reload --port 8000

The server creates the database tables automatically on first start.

5. Start the frontend

cd ../frontend
npm install
npm run dev

Open http://localhost:5173.
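
With the backend running, the upload-and-poll flow can also be driven from a script instead of the UI. The following is a rough sketch of the polling half only. The status values `"completed"` and `"failed"` and the `status` field name are assumptions, since this README does not document the job response shape (see `schemas.py` and `routers/jobs.py` for the real one).

```python
import time
from typing import Callable


def poll_until_finished(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the job reaches a final state or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status()
        # Assumed final states; adjust to whatever the API actually returns.
        if job.get("status") in {"completed", "failed"}:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("conversion did not finish in time")
        sleep(interval)
```

Here `fetch_status` would typically wrap an HTTP GET against the jobs route on port 8000 and return the decoded JSON. Keeping it injectable means the loop can be exercised without a running server.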

TTS Providers

Provider	Free	API Key	Quality	Notes
`edge`	Yes	No	Good	Microsoft Edge TTS — default
`gtts`	Yes	No	Basic	Google Translate TTS
`openai`	No	`OPENAI_API_KEY`	Excellent	OpenAI `tts-1` model
`elevenlabs`	Free tier: 10k chars/mo	`ELEVENLABS_API_KEY`	Best	Multilingual, natural voices

Set the default in .env with TTS_PROVIDER=edge. Users can override per-upload in the UI.
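
The precedence described here (a per-upload choice overrides `TTS_PROVIDER`, which falls back to `edge`) can be sketched as below. The function name `resolve_provider` and the error handling are hypothetical; only the provider names and API key variables come from the table above.

```python
import os

# Provider name -> environment variable holding its API key (None = no key needed).
PROVIDER_KEYS = {
    "edge": None,
    "gtts": None,
    "openai": "OPENAI_API_KEY",
    "elevenlabs": "ELEVENLABS_API_KEY",
}


def resolve_provider(requested=None, env=os.environ):
    """Pick the per-upload provider if given, else TTS_PROVIDER, else 'edge'."""
    name = (requested or env.get("TTS_PROVIDER") or "edge").strip().lower()
    if name not in PROVIDER_KEYS:
        raise ValueError(f"Unknown TTS provider: {name!r}")
    key_var = PROVIDER_KEYS[name]
    if key_var and not env.get(key_var):
        raise ValueError(f"{name} requires {key_var} to be set")
    return name
```

Failing early on a missing key, rather than at synthesis time, is a design choice of this sketch and may differ from how Amara reports the problem.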

Configuration

All settings live in .env (copy from .env.example):

Variable	Default	Description
`DATABASE_URL`	`postgresql://postgres:postgres@localhost:5432/amara`	PostgreSQL connection string
`TTS_PROVIDER`	`edge`	Default TTS provider (`edge`, `gtts`, `openai`, `elevenlabs`)
`TTS_VOICE`	`en-US-AriaNeural`	Default voice for the configured provider
`OPENAI_API_KEY`	(empty)	Required for `openai` TTS and OpenAI script formatting
`ELEVENLABS_API_KEY`	(empty)	Required for `elevenlabs` TTS
`SCRIPT_MODE`	`audiobook`	Default script mode (`audiobook` or `faithful`)
`SCRIPT_FORMATTER_PROVIDER`	`openai`	`openai` uses LLM rewriting; anything else uses local regex cleanup
`SCRIPT_FORMATTER_MODEL`	`gpt-4o-mini`	OpenAI model used for script formatting
`MAX_FILE_SIZE_MB`	`20`	Upload size limit
`AUDIO_DIR`	`backend/audio_files`	Where generated MP3s are stored
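
To make the defaults above concrete, here is a standard-library sketch of reading a few of these variables. Amara itself uses pydantic-settings in `config.py`, so this is an illustration of the table rather than the project's code. The `Settings` fields, `load_settings`, and the choice of 1 MB = 1024 * 1024 bytes are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    tts_provider: str
    tts_voice: str
    script_mode: str
    max_file_size_mb: int

    @property
    def max_file_size_bytes(self) -> int:
        # Assumption: the limit is interpreted in binary megabytes.
        return self.max_file_size_mb * 1024 * 1024


def load_settings(env=os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    script_mode = env.get("SCRIPT_MODE", "audiobook")
    if script_mode not in {"audiobook", "faithful"}:
        raise ValueError(f"SCRIPT_MODE must be 'audiobook' or 'faithful', got {script_mode!r}")
    return Settings(
        database_url=env.get(
            "DATABASE_URL", "postgresql://postgres:postgres@localhost:5432/amara"
        ),
        tts_provider=env.get("TTS_PROVIDER", "edge"),
        tts_voice=env.get("TTS_VOICE", "en-US-AriaNeural"),
        script_mode=script_mode,
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "20")),
    )
```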

Project Structure

amara/
├── backend/
│   ├── main.py                  # FastAPI app
│   ├── config.py                # Settings (pydantic-settings)
│   ├── models.py                # SQLAlchemy Conversion model
│   ├── schemas.py               # Pydantic response schemas
│   ├── routers/
│   │   ├── uploads.py           # POST /api/upload
│   │   └── jobs.py              # GET/DELETE /api/jobs, GET /api/audio
│   ├── services/
│   │   ├── extractor.py         # PDF / DOCX / text extraction
│   │   ├── pipeline.py          # Orchestrates extract → format → TTS → store
│   │   ├── storage.py           # Local disk audio storage
│   │   ├── formatter/           # Script cleanup / OpenAI rewriting
│   │   └── tts/                 # TTS provider implementations
│   ├── migrations/              # SQL migrations (auto-applied on startup)
│   └── tests/
├── frontend/
│   └── src/
│       ├── api.js
│       ├── App.jsx
│       └── components/
│           ├── UploadForm.jsx
│           ├── JobStatus.jsx
│           ├── ConversionHistory.jsx
│           └── AudioPlayer.jsx
├── .env.example
└── README.md
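
The role of `pipeline.py` (extract, format, TTS, store) can be illustrated with a small orchestration sketch. The `Pipeline` class and its callables are hypothetical stand-ins for the services listed above, not the actual interfaces in the repository.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    extract: Callable[[str], str]        # file path -> raw text
    format_script: Callable[[str], str]  # raw text -> narration script
    synthesize: Callable[[str], bytes]   # script -> audio bytes
    store: Callable[[bytes], str]        # audio bytes -> stored location

    def run(self, path: str) -> str:
        """Run the four stages in order and return where the audio was stored."""
        text = self.extract(path)
        if not text.strip():
            raise ValueError("no text could be extracted")
        script = self.format_script(text)
        audio = self.synthesize(script)
        return self.store(audio)
```

Passing each stage in as a callable keeps the orchestration independent of any one TTS provider or storage backend, and makes it easy to test with stubs.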

Development

# Backend tests
cd backend
python -m unittest discover tests

# Lint / format
ruff check .
ruff format .

# Frontend lint
cd ../frontend
npm run lint

Contributing

Pull requests are welcome. For large changes, open an issue first to discuss the approach. Please run the backend tests and linter before submitting.

License

MIT
