fix: retry with token-level truncation on ONNX OOM in embedding worker by BYK · Pull Request #457 · BYK/loreai

BYK · 2026-05-22T18:49:14Z

Summary

Fixes ONNX Runtime OOM errors (e.g. 284432024) that occur when embedding dense-token content like code, base64, CJK text, or JSON with short keys, where the chars-per-token ratio is much lower than the ~4 assumed for English prose.

Problem

The existing character-level pre-truncation (LOCAL_MAX_CHARS = 16384, assuming ~4 chars/token) can produce 6000-8000+ tokens for dense content — well above the ~4096 safe threshold for ONNX inference. The FeatureExtractionPipeline in transformers.js hardcodes { truncation: true } without forwarding max_length, so its built-in truncation caps at the model max (8192 tokens), which still OOMs.

Fix

Instead of aggressively pre-truncating every request, the worker now:

Tries inference at full length first — normal English text passes through untouched
On OOM, catches the error using the existing isOomError() detector
Retries with token-level truncation using the pipeline's actual tokenizer (encode → slice → decode), progressively halving the limit: full → 4096 → 2048 → 1024 tokens

This preserves maximum semantic content for normal texts while adaptively handling dense-token edge cases.

Changes

Extracts runInference() helper from processEmbed() for retry-ability
Adds truncateTexts() that uses the real tokenizer for exact content-token counting (add_special_tokens: false to exclude [CLS]/[SEP])
Stashes a reference to pipe.tokenizer during pipeline init
Adds OOM retry loop (1 initial attempt + 3 truncated retries) with halved token limits
Logs console.warn on each retry for observability
Updates LOCAL_MAX_CHARS comment to reference the new worker-level defense

BYK self-assigned this May 22, 2026

BYK force-pushed the fix-onnx-oom-retry branch from c34c862 to 11cb536 Compare May 22, 2026 19:04

fix: retry with token-level truncation on ONNX OOM in embedding worker

c412285

BYK force-pushed the fix-onnx-oom-retry branch from 11cb536 to c412285 Compare May 22, 2026 19:06

BYK merged commit 9cd94ec into main May 22, 2026
7 checks passed

BYK deleted the fix-onnx-oom-retry branch May 22, 2026 19:09

craft-deployer Bot mentioned this pull request May 22, 2026

publish: BYK/loreai@0.24.1 #458

Closed

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: retry with token-level truncation on ONNX OOM in embedding worker#457

fix: retry with token-level truncation on ONNX OOM in embedding worker#457
BYK merged 1 commit into
mainfrom
fix-onnx-oom-retry

BYK commented May 22, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

BYK commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Fix

Changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

BYK commented May 22, 2026 •

edited

Loading