Build a WebAssembly-Based OCR Worker for Browser Compatibility

**Labels:** `hard` `enhancement` `research` `frontend` `gssoc-2026`

The current OCR engine uses Tesseract via `pytesseract`, a Python wrapper that runs the Tesseract binary as a subprocess on the backend. Build an alternative OCR worker that runs entirely in the browser using `tesseract.js` (Tesseract compiled to WebAssembly), enabling Execra's frontend to perform OCR on screen captures without making a network call to the backend.

**What you'll code:**
- Create `frontend/workers/ocr_worker.js` (Web Worker):
  - Load `tesseract.js` (NPM package `tesseract.js@5`) inside the worker
  - Initialize the worker on load: `await createWorker("eng")` — download English language data once and cache it in IndexedDB
  - Message handler: receives `{type: "recognize", imageData: ImageData, id: string}` from the main thread
  - Returns `{type: "result", id: string, text: string, confidence: number, words: [{text, bbox, confidence}]}`
  - Returns `{type: "error", id: string, error: string}` on failure
  - Performance: process a 1920×1080 frame in under 800ms on a modern laptop
- Create `frontend/utils/ocr_client.js`:
  - `OCRClient` class wrapping the Web Worker with a Promise-based API:
    - `recognize(imageData: ImageData) -> Promise<OCRResult>`: sends work to the worker and returns a promise that resolves when the worker replies with a `result` (matched by `id` UUID) and rejects when it replies with an `error`
    - `isReady() -> boolean` — tracks worker initialization state
    - `terminate()` — shuts down the worker
- Integration in `frontend/renderer/app.js`:
  - When the backend WebSocket is disconnected, fall back to local OCR for basic screen text display in the overlay
  - Add a status indicator in the overlay: `"OCR: Local (offline)"` vs `"OCR: Backend (online)"`
- Document the browser compatibility matrix in `docs/browser_ocr_compatibility.md` (Chrome 88+, Firefox 79+, Edge 88+)
- Write Jest unit tests for `OCRClient`: mock Worker, test Promise resolution, test error handling
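
The `OCRClient` contract above can be sketched as follows. This is a minimal illustration under assumptions, not the project's implementation: the worker is injected through the constructor (a design choice made here so that Jest tests can pass a mock), the `{type: "ready"}` message is hypothetical because the issue does not define how the worker signals initialization, and the `makeId` helper is invented for this sketch. It falls back to a counter because `crypto.randomUUID()` is not available in the oldest browsers listed in the compatibility matrix.

```javascript
// Hypothetical helper: prefer a real UUID, fall back to a unique counter-based id.
let fallbackCounter = 0;
function makeId() {
  const c = globalThis.crypto;
  if (c && typeof c.randomUUID === "function") {
    return c.randomUUID();
  }
  fallbackCounter += 1;
  return `ocr-${Date.now()}-${fallbackCounter}`;
}

class OCRClient {
  // `worker` is any object with postMessage(), terminate(), and an onmessage slot,
  // for example `new Worker(...)` in the browser or a mock in tests.
  constructor(worker) {
    this.worker = worker;
    this.ready = false;
    this.pending = new Map(); // id -> { resolve, reject }
    this.worker.onmessage = (event) => this.handleMessage(event.data);
  }

  handleMessage(msg) {
    if (msg.type === "ready") {
      // Assumed initialization signal; not part of the protocol in the issue.
      this.ready = true;
      return;
    }
    if (msg.type !== "result" && msg.type !== "error") return;
    const entry = this.pending.get(msg.id);
    if (!entry) return; // unknown id or already settled
    this.pending.delete(msg.id);
    if (msg.type === "result") {
      entry.resolve({ text: msg.text, confidence: msg.confidence, words: msg.words });
    } else {
      entry.reject(new Error(msg.error));
    }
  }

  // Sends one frame to the worker; the reply is matched back by `id`.
  recognize(imageData) {
    const id = makeId();
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.worker.postMessage({ type: "recognize", imageData, id });
    });
  }

  isReady() {
    return this.ready;
  }

  // Stops the worker and rejects any requests that are still waiting.
  terminate() {
    this.worker.terminate();
    for (const { reject } of this.pending.values()) {
      reject(new Error("OCR worker terminated"));
    }
    this.pending.clear();
    this.ready = false;
  }
}
```

In the browser, the client might be created with something like `new OCRClient(new Worker("workers/ocr_worker.js"))`, where the path is illustrative. In a Jest test, a plain object with `postMessage` and `terminate` can stand in for the worker, and replies are simulated by calling its `onmessage` with `{data: {...}}`.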

**Skills needed:** JavaScript · Web Workers · WebAssembly · `tesseract.js` · IndexedDB · Promise patterns · Jest

**👉 [Claim this issue on GitHub →](https://github.com/sahoo-tech/execra/issues)**

---


Build a WebAssembly-Based OCR Worker for Browser Compatibility #136
