Labels: hard enhancement research frontend gssoc-2026
The current OCR engine uses Tesseract via pytesseract, a Python wrapper that runs the Tesseract binary as a subprocess. Build an alternative OCR worker that runs entirely in the browser using Tesseract compiled to WebAssembly (via tesseract.js), so that Execra's frontend can perform OCR on screen captures without making a network call to the backend.
What you'll code:
- Create frontend/workers/ocr_worker.js (Web Worker):
  - Load tesseract.js (NPM package tesseract.js@5) inside the worker
  - Initialize the worker on load: await createWorker("eng"), which downloads the English language data once and caches it in IndexedDB
  - Message handler: receives {type: "recognize", imageData: ImageData, id: string} from the main thread
  - Returns {type: "result", id: string, text: string, confidence: number, words: [{text, bbox, confidence}]} on success
  - Returns {type: "error", id: string, error: string} on failure
  - Performance target: process a 1920×1080 frame in under 800 ms on a modern laptop
- Create frontend/utils/ocr_client.js:
  - OCRClient class wrapping the Web Worker with a Promise-based API:
    - recognize(imageData: ImageData) -> Promise<OCRResult>: sends the work to the worker and returns a promise that resolves when the worker replies (request and reply are matched by a UUID id)
    - isReady() -> boolean: reports whether the worker has finished initializing
    - terminate(): shuts down the worker
- Integration in frontend/renderer/app.js:
  - When the backend WebSocket is disconnected, fall back to local OCR for basic screen text display in the overlay
  - Add a status indicator in the overlay: "OCR: Local (offline)" vs "OCR: Backend (online)"
- Document the browser compatibility matrix in docs/browser_ocr_compatibility.md (Chrome 88+, Firefox 79+, Edge 88+)
- Write Jest unit tests for OCRClient: mock Worker, test Promise resolution, test error handling
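
The OCRClient contract described above (one promise per request, matched to the worker's reply by id) could be sketched roughly as follows. This is only one possible shape, not the required implementation. Several details are assumptions that the issue does not specify: the worker is passed into the constructor so it can be mocked, readiness is signalled by a hypothetical {type: "ready"} message from the worker, pending requests are rejected on terminate(), and a counter is used for ids where crypto.randomUUID is unavailable.

```javascript
// Sketch of frontend/utils/ocr_client.js.
// Assumption: the Worker is injected so tests can pass a mock instead.
class OCRClient {
  constructor(worker) {
    this.worker = worker;
    this.ready = false;        // set by a hypothetical {type: "ready"} message
    this.pending = new Map();  // request id -> { resolve, reject }
    this.nextId = 0;           // fallback id counter
    this.worker.onmessage = (event) => this.handleMessage(event.data);
  }

  // Route a worker reply to the promise that is waiting on its id.
  handleMessage(msg) {
    if (msg.type === "ready") {
      this.ready = true;
      return;
    }
    const entry = this.pending.get(msg.id);
    if (!entry) return; // unknown or already settled id: ignore
    this.pending.delete(msg.id);
    if (msg.type === "result") {
      const { text, confidence, words } = msg;
      entry.resolve({ text, confidence, words });
    } else if (msg.type === "error") {
      entry.reject(new Error(msg.error));
    } else {
      entry.reject(new Error(`Unexpected message type: ${msg.type}`));
    }
  }

  // Send one frame to the worker; the promise settles when the reply arrives.
  recognize(imageData) {
    const id = this.makeId();
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.worker.postMessage({ type: "recognize", imageData, id });
    });
  }

  isReady() {
    return this.ready;
  }

  // Stop the worker and reject anything still in flight (an assumption;
  // the issue only says terminate() shuts the worker down).
  terminate() {
    this.worker.terminate();
    this.ready = false;
    for (const { reject } of this.pending.values()) {
      reject(new Error("OCR worker terminated"));
    }
    this.pending.clear();
  }

  // Prefer a real UUID; fall back to a counter where randomUUID is missing.
  makeId() {
    const c = globalThis.crypto;
    if (c && typeof c.randomUUID === "function") return c.randomUUID();
    this.nextId += 1;
    return `req-${this.nextId}`;
  }
}
```

Because the worker is injected, a Jest test can hand the class a plain object with postMessage and terminate methods, call its onmessage handler directly, and check both the resolved and the rejected path without loading tesseract.js.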
Skills needed: JavaScript · Web Workers · WebAssembly · tesseract.js · IndexedDB · Promise patterns · Jest