fix: pre-truncate long texts to prevent ONNX OOM and report embedding errors to Sentry #343
Merged
… errors to Sentry

Pre-truncate texts to ~4096 tokens (LOCAL_MAX_CHARS=16384) before sending to the ONNX worker. The Nomic v1.5 model supports 8192 tokens max, but ONNX runtime OOMs on inputs near that ceiling (error codes 284432024, 287180544, 144786472). nextBatch() always includes at least 1 item, so the MAX_BATCH_TOKEN_AREA guard was bypassed for single long texts.

Upgrade embedding error reporting from log.info to log.error so failures reach Sentry via captureException:
- Worker init-error handler (embedding.ts)
- Worker crash/exit handlers (embedding.ts, previously had no logging)
- Fire-and-forget embedding catches for knowledge/distillation/temporal
- Top-level startup backfill catch (pipeline.ts)
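As a rough illustration of the pre-truncation step described in the commit message, the sketch below caps a string at `LOCAL_MAX_CHARS` code units before it would be handed to a worker. The function names `preTruncate` and `approxTokens` are hypothetical, and the 4-characters-per-token ratio is only the heuristic implied by "16384 chars ≈ 4096 tokens"; the actual code in `LocalProvider.embed()` may differ.

```typescript
// Value taken from the PR description.
const LOCAL_MAX_CHARS = 16_384;

// Assumption for illustration: roughly 4 characters per token, which is the
// ratio implied by "~4096 tokens (LOCAL_MAX_CHARS=16384)".
const APPROX_CHARS_PER_TOKEN = 4;

// Hypothetical helper: cap the text at maxChars UTF-16 code units.
// Texts at or below the limit are returned unchanged.
function preTruncate(text: string, maxChars: number = LOCAL_MAX_CHARS): string {
  return text.length > maxChars ? text.slice(0, maxChars) : text;
}

// Hypothetical helper: rough token estimate for a given character count.
function approxTokens(chars: number): number {
  return Math.ceil(chars / APPROX_CHARS_PER_TOKEN);
}

// Usage: a 20,000-character input is cut down before it reaches the worker.
const longInput = "a".repeat(20_000);
const capped = preTruncate(longInput);
console.log(capped.length, approxTokens(capped.length));
```

Truncating by characters is cheaper than tokenizing first, at the cost of being only an approximation of the token count.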
BYK added a commit that referenced this pull request May 15, 2026
…air truncation (#344)

## Summary

Follow-up to #343. Addresses Sentry noise and a surrogate pair edge case found during self-review.

- Add `isAvailable()` guard to fire-and-forget embedding functions to short-circuit when provider is broken
- Break backfill loops on `LocalProviderUnavailableError` to avoid O(batches) Sentry events per startup
- Extract `safeLocalTruncate()` helper that avoids splitting UTF-16 surrogate pairs at the truncation boundary

## Problem

PR #343 upgraded `log.info` to `log.error` in fire-and-forget embedding catches (`embedKnowledgeEntry`, `embedDistillation`, `embedTemporalMessage`). But when the local provider is broken, **every single call** to these functions would throw and fire `log.error` → `captureException()`, potentially 50-200+ Sentry events per session.

Similarly, the backfill loops would retry every batch even after the first one fails with `LocalProviderUnavailableError`, producing O(items/batchSize) Sentry events on startup.

The `String.slice()` truncation could also split a UTF-16 surrogate pair (emoji, CJK supplementary chars), producing an invalid lone surrogate passed to the tokenizer.

## Changes

| Location | Fix |
|---|---|
| `embedKnowledgeEntry()` | Add `if (!isAvailable()) return;` early exit |
| `embedDistillation()` | Add `if (!isAvailable()) return;` early exit |
| `embedTemporalMessage()` | Add `if (!isAvailable()) return;` early exit |
| `backfillEmbeddings()` catch | `break` on `LocalProviderUnavailableError` |
| `backfillDistillationEmbeddings()` catch | `break` on `LocalProviderUnavailableError` |
| `LocalProvider.embed()` | Use `safeLocalTruncate()` helper instead of raw `slice()` |
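The surrogate pair edge case can be sketched as follows. This is a minimal illustration, not the `safeLocalTruncate()` helper from the commit, whose body is not shown here; the name `truncateWithoutSplittingSurrogates` is hypothetical. It relies only on the fact that a UTF-16 high surrogate lies in the range 0xD800-0xDBFF and must be followed by a low surrogate.

```typescript
// Hypothetical helper: truncate to at most maxChars UTF-16 code units without
// leaving a lone high surrogate at the end of the result.
function truncateWithoutSplittingSurrogates(text: string, maxChars: number): string {
  if (maxChars <= 0) return "";
  if (text.length <= maxChars) return text;

  let end = maxChars;
  // If the last kept code unit is a high surrogate, its low surrogate would be
  // cut off by the slice, so drop the high surrogate as well.
  const last = text.charCodeAt(end - 1);
  if (last >= 0xd800 && last <= 0xdbff) {
    end -= 1;
  }
  return text.slice(0, end);
}

// Usage: "😀" occupies two code units, at indices 2 and 3.
const sample = "ab😀cd";
console.log(sample.slice(0, 3).length);                            // 3, ends in a lone surrogate
console.log(truncateWithoutSplittingSurrogates(sample, 3).length); // 2, the pair is dropped whole
```

The result may be one code unit shorter than the limit, which is harmless when the limit is itself only an approximate token budget.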
This was referenced May 15, 2026
## Summary

- Pre-truncate texts to ~4096 tokens (`LOCAL_MAX_CHARS=16384` chars) in `LocalProvider.embed()` before sending to the ONNX worker, preventing OOM on single inputs near the model's 8192-token max (error code `284432024`)
- Upgrade embedding error reporting from `log.info` (or no logging) to `log.error` so failures reach Sentry via `captureException`

## Problem

ONNX OOM: A single text tokenized to 8192 tokens (the model's max sequence length) caused an ONNX runtime allocation failure. `nextBatch()` always includes at least 1 item regardless of `MAX_BATCH_TOKEN_AREA`, so the budget guard was bypassed. The worker's `truncation: true` caps at the model max, but that's already too large for ONNX to allocate.

Silent Sentry: Embedding errors were invisible in Sentry because:

- `pipeline.ts:591`: top-level backfill catch used `log.info`
- `embedding.ts:335`: worker init failure used `log.info`
- `embedding.ts:853,873,898`: fire-and-forget catches used `log.info`
- `embedding.ts:351-359,361-373`: worker crash/exit handlers had no logging at all

Only `log.error()` calls `sink?.captureException(err)` via the Sentry bridge.

## Changes

| File | Change |
|---|---|
| `packages/core/src/embedding.ts` | Add `LOCAL_MAX_CHARS` constant and pre-truncation in `LocalProvider.embed()` |
| `packages/core/src/embedding.ts` | Upgrade `init-error`, `crash`, `exit` handlers to `log.error` |
| `packages/core/src/embedding.ts` | Upgrade fire-and-forget catches to `log.error` |
| `packages/gateway/src/pipeline.ts` | Upgrade top-level backfill catch to `log.error` |
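The batching bypass described under Problem can be illustrated with a small sketch. The real `nextBatch()` is not shown in this PR, so everything below is an assumption: the `QueuedText` shape, the budget value, and the idea that "token area" means batch size times the longest sequence in the batch. The point is only to show how an "always take at least one item" rule lets a single oversized text past a budget guard.

```typescript
// Hypothetical item shape; the real queue entries are not shown in the PR.
interface QueuedText {
  id: string;
  tokens: number; // token count after tokenization
}

// Illustrative value only; the real MAX_BATCH_TOKEN_AREA is not given.
const MAX_BATCH_TOKEN_AREA = 16_384;

// Assumed definition: every row is padded to the longest sequence in the batch.
function tokenArea(batchSize: number, longest: number): number {
  return batchSize * longest;
}

// Greedy batcher that removes items from the front of `queue`.
function nextBatch(queue: QueuedText[], budget: number = MAX_BATCH_TOKEN_AREA): QueuedText[] {
  const batch: QueuedText[] = [];
  let longest = 0;
  while (queue.length > 0) {
    const candidate = queue[0];
    if (candidate === undefined) break;
    const nextLongest = Math.max(longest, candidate.tokens);
    const overBudget = tokenArea(batch.length + 1, nextLongest) > budget;
    // The first item is accepted unconditionally so the queue always drains.
    // This is the path a single oversized text takes past the guard.
    if (batch.length > 0 && overBudget) break;
    queue.shift();
    batch.push(candidate);
    longest = nextLongest;
  }
  return batch;
}

// Usage: with a budget of 4096, the 8192-token item is still returned alone.
const demoQueue: QueuedText[] = [
  { id: "long", tokens: 8192 },
  { id: "short", tokens: 100 },
];
const demoBatch = nextBatch(demoQueue, 4096);
console.log(demoBatch.map((item) => item.id), demoQueue.length);
```

Under these assumptions the guard only limits how many items join the first one, which is why capping the input length before batching is the more direct fix.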