VoxLens

A dark-first React dashboard paired with a FastAPI scaffold for OCR-driven dialogue capture, relay-model cleanup, and local persona-based TTS playback.

Frontend

The app includes:

  • snipping-tool-style drawable capture zones
  • multi-display zone targeting
  • per-game profile management
  • character vault with editable archetypes
  • mood and traits controls
  • relay AI settings for local models and API keys
  • live OCR preview, relay preview, and TTS status indicators

The browser client connects to ws://localhost:8000/ws by default. To override that, set VITE_OVERLAY_SOCKET_URL. For REST settings and profile calls, the default backend base URL is http://localhost:8000 and can be overridden with VITE_BACKEND_HTTP_URL.
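
As a sketch, the two overrides above could be set in a Vite `.env.local` file (the values shown are just the documented defaults; adjust them for a remote backend):

```
VITE_OVERLAY_SOCKET_URL=ws://localhost:8000/ws
VITE_BACKEND_HTTP_URL=http://localhost:8000
```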

Backend scaffold

The backend/ folder contains a FastAPI service with:

  • GET /health for service and dependency status
  • GET /screens for multi-monitor capture metadata
  • GET /profiles, POST /profiles, DELETE /profiles/{name} for profile storage
  • GET /settings, PUT /settings for local-model and API-key configuration
  • POST /tts for local TTS requests
  • WS /ws for bidirectional zone updates, OCR text events, relay events, status updates, and audio bytes
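
The /ws channel multiplexes several event kinds. As an illustrative sketch only (the scaffold's actual message schema is not documented here, so the `type`/`payload` envelope and field names below are assumptions), a zone-update event might be serialized like this:

```python
import json

# Hypothetical WS message envelope; the "type"/"payload" field names are
# assumptions, not the scaffold's actual schema.
def make_zone_update(zone_id: str, left_pct: float, top_pct: float,
                     width_pct: float, height_pct: float) -> str:
    """Serialize a zone-update event for the /ws channel."""
    return json.dumps({
        "type": "zone_update",
        "payload": {
            "id": zone_id,
            # zones are stored as percentages of the selected monitor
            "left": left_pct, "top": top_pct,
            "width": width_pct, "height": height_pct,
        },
    })

msg = make_zone_update("dialogue", 10.0, 75.0, 80.0, 20.0)
```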

Dual AI flow

The speech pipeline is now:

  1. OCR extracts nameplate + dialogue text
  2. a relay AI cleans up OCR mistakes and rewrites the line into a more natural, speakable sentence
  3. the TTS engine receives the cleaned line along with persona prompt context
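
As a sketch of step 2's fully local fallback, a heuristic cleanup pass might normalize common OCR artifacts before speech. This function is illustrative only, not the scaffold's actual implementation:

```python
import re

# Illustrative heuristic cleanup; the scaffold's actual rules may differ.
OCR_CHAR_FIXES = {
    "|": "I",  # vertical bar commonly misread for a capital I
}

def heuristic_clean(raw: str) -> str:
    """Normalize OCR'd dialogue text before handing it to TTS."""
    text = raw
    for bad, good in OCR_CHAR_FIXES.items():
        text = text.replace(bad, good)
    # drop short bracketed UI tokens such as [A] button prompts
    text = re.sub(r"\[[^\]]{1,3}\]", "", text)
    # collapse runs of whitespace left over from layout-aware OCR
    return re.sub(r"\s+", " ", text).strip()
```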

Relay providers currently supported:

  • heuristic — fully local fallback cleanup without any external model
  • openai-compatible — works with local or remote chat-completions endpoints, including local OpenAI-compatible model servers
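
For the openai-compatible provider, the relay request follows the standard chat-completions body shape. A minimal sketch (the model name and system prompt wording are placeholders, not values the scaffold ships with):

```python
import json

# Builds a chat-completions request body for an OpenAI-compatible relay
# endpoint (local or remote). Model name and prompt wording are placeholders.
def build_relay_request(ocr_line: str, model: str = "local-model") -> dict:
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "Rewrite noisy OCR game dialogue into one clean, "
                        "natural, speakable sentence. Output only the line."},
            {"role": "user", "content": ocr_line},
        ],
        "temperature": 0.2,
    }

body = json.dumps(build_relay_request("Wh0 g0es there ?"))
```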

The TTS engine remains local Qwen3-TTS in the current scaffold, with saved settings for model name/path.

Python packages

Install the packages listed in backend/requirements.txt:

  • fastapi
  • uvicorn[standard]
  • websockets
  • pydantic
  • numpy
  • mss
  • paddleocr
  • qwen3_tts

Running the backend

  1. Create a Python environment for the backend/ folder.
  2. Install the requirements from backend/requirements.txt.
  3. Start the FastAPI app with Uvicorn, targeting backend.main:app on port 8000.
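
Concretely, steps 1-3 might look like the following (the venv name is just an example; on Windows, activate with .venv\Scripts\activate instead):

```shell
python -m venv .venv
source .venv/bin/activate
pip install -r backend/requirements.txt
uvicorn backend.main:app --host 127.0.0.1 --port 8000
```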

If mss, PaddleOCR, Qwen3-TTS, or your relay model endpoint is not ready yet, the scaffold still starts: it falls back to demo OCR text, heuristic relay cleanup, and silent WAV output so the UI flow can be tested safely.

Notes on capture and safety

  • Screen capture uses mss only.
  • No memory reads or injected hooks are used in the backend scaffold.
  • Zones are stored as percentages and converted to pixels per selected monitor at runtime.
  • API keys are stored in backend/settings.json and are not returned to the frontend once saved; the UI only receives whether a key is already configured.
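
The percentage-to-pixel conversion described above can be sketched as follows. The monitor dict mirrors the geometry shape mss reports ("left", "top", "width", "height"); the zone field names are assumptions, not the scaffold's exact schema:

```python
# Convert a zone stored as monitor-relative percentages into pixel
# coordinates on the selected monitor. The monitor dict mirrors the shape
# mss reports ("left", "top", "width", "height"); the zone field names
# are illustrative, not the scaffold's exact schema.
def zone_to_pixels(zone: dict, monitor: dict) -> dict:
    return {
        "left": monitor["left"] + round(monitor["width"] * zone["left"] / 100),
        "top": monitor["top"] + round(monitor["height"] * zone["top"] / 100),
        "width": round(monitor["width"] * zone["width"] / 100),
        "height": round(monitor["height"] * zone["height"] / 100),
    }

px = zone_to_pixels(
    {"left": 10.0, "top": 75.0, "width": 80.0, "height": 20.0},
    {"left": 0, "top": 0, "width": 1920, "height": 1080},
)
```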

Adding archetypes

Archetypes can be added in two ways:

  • through the Character Vault UI in the frontend
  • by editing backend/profiles.json and adding a new { id, name, basePrompt } entry inside a profile
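
A hand-added entry following the { id, name, basePrompt } shape could look like the fragment below (the keys come from the text above; the values are illustrative, and the surrounding profile structure is assumed):

```json
{
  "id": "gruff-guard",
  "name": "Gruff Guard",
  "basePrompt": "Speak in a low, weary, no-nonsense soldier's voice."
}
```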

Current status

Implemented now:

  • dark mode default theme
  • multi-display overlay dashboard UI
  • profile and archetype management
  • relay AI settings with local model/API key support
  • WebSocket client helper
  • backend FastAPI scaffold with OCR -> relay -> TTS flow
  • Vitest coverage for zone math and profile serialization

Still recommended next:

  • tune PaddleOCR preprocessing for your target games
  • connect the saved Qwen3-TTS model path directly into your exact local loader contract
  • add alias-based fuzzy persona matching beyond exact nameplate matches
