fix: streaming robustness, RRF list priority, and CLI suggestion quality by BYK · Pull Request #268 · BYK/loreai

BYK · 2026-05-12T20:22:33Z

Summary

Follow-up fixes from self-review of PR #267. Addresses 2 critical, 2 moderate, and 1 minor issue found during code review.

Critical: Streaming translator robustness (C1, C2)

C1 — Responses API error handling: The translateAnthropicStreamToResponses translator silently closed the stream on upstream errors, leaving clients hanging with no terminal event. Now emits a response.failed event with error details before closing.
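As a rough illustration of C1, the sketch below builds a terminal `response.failed` server-sent event that a translator could enqueue from its error path before closing. The helper name `failedEventFrame`, the `upstream_error` code, and the exact payload fields are assumptions made for this example and are not taken from the PR's code.

```typescript
// Hypothetical helper: formats a terminal `response.failed` SSE frame.
// The payload fields are illustrative assumptions, not the PR's exact shape.
interface FailedEvent {
  type: "response.failed";
  response: {
    id: string;
    status: "failed";
    error: { code: string; message: string };
  };
}

function failedEventFrame(responseId: string, err: unknown): string {
  const message = err instanceof Error ? err.message : String(err);
  const event: FailedEvent = {
    type: "response.failed",
    response: {
      id: responseId,
      status: "failed",
      error: { code: "upstream_error", message },
    },
  };
  // An SSE frame is an `event:` line, a `data:` line, then a blank line.
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}
```

A translator's `catch` block could encode this string and enqueue it before calling `controller.close()`, so the client always sees a terminal event.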

C2 — Client disconnect handling: Neither streaming translator had a cancel() handler. If a client disconnected mid-stream, the upstream reader continued consuming the full response into the void, wasting bandwidth. Both translators now:

- Implement cancel() to cancel the upstream Response.body reader
- Use a safeEnqueue() pattern that sets a cancelled flag on enqueue failure
- Check cancelled at the top of each loop iteration to break early
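The three points above can be sketched roughly as follows, using the standard Web Streams API. The function name `translateStream` and the `translateChunk` callback are placeholders for this example; the real translators parse and re-emit events rather than mapping chunks one to one, and this sketch pumps eagerly without handling backpressure.

```typescript
// Sketch of a pass-through translator with cancel() and safeEnqueue().
// `translateChunk` stands in for the real per-event translation logic.
function translateStream(
  upstream: ReadableStream<Uint8Array>,
  translateChunk: (chunk: Uint8Array) => Uint8Array,
): ReadableStream<Uint8Array> {
  const reader = upstream.getReader();
  let cancelled = false;

  return new ReadableStream<Uint8Array>({
    async start(controller) {
      // Enqueue that records failure instead of throwing after a cancel.
      const safeEnqueue = (chunk: Uint8Array): void => {
        if (cancelled) return;
        try {
          controller.enqueue(chunk);
        } catch {
          cancelled = true;
        }
      };

      try {
        while (true) {
          if (cancelled) break; // checked at the top of every iteration
          const result = await reader.read();
          if (result.done) break;
          safeEnqueue(translateChunk(result.value));
        }
        if (!cancelled) controller.close();
      } catch (err) {
        if (!cancelled) controller.error(err);
      }
    },
    // Called when the client disconnects: stop reading from upstream.
    async cancel(reason) {
      cancelled = true;
      await reader.cancel(reason);
    },
  });
}
```

Cancelling the upstream reader also settles any pending `read()` with `done: true`, so the loop exits instead of draining the rest of the response.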

Moderate: CLI suggestion quality (M4)

The "did you mean?" matching used a 2-character prefix check, which was far too loose (e.g., lore hi would suggest help). Replaced it with Levenshtein-distance matching, using a maximum-distance threshold of max(2, floor(len/2)), which gives accurate suggestions for typos.
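A minimal sketch of this kind of matching is shown below. It assumes `len` in the threshold refers to the length of the typed input, and the names `levenshtein` and `suggestCommand` as well as the command list in the usage note are hypothetical.

```typescript
// Classic two-row Levenshtein edit distance.
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost, // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the closest command within the threshold, or undefined.
// Assumption: `len` in max(2, floor(len/2)) is the input length.
function suggestCommand(input: string, commands: string[]): string | undefined {
  const maxDistance = Math.max(2, Math.floor(input.length / 2));
  let best: string | undefined;
  let bestDistance = Infinity;
  for (const cmd of commands) {
    const d = levenshtein(input, cmd);
    if (d <= maxDistance && d < bestDistance) {
      best = cmd;
      bestDistance = d;
    }
  }
  return best;
}
```

With an illustrative command list of `["help", "status", "search"]`, the input `hepl` is at distance 2 from `help` and is suggested, while `hi` is at distance 3 from `help`, exceeds the threshold of 2, and yields no suggestion.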

Moderate: RRF list cap priority (m4)

With query expansion producing 3 queries, the per-query BM25 lists (9+) filled the MAX_RRF_LISTS=10 cap before vector search, lat.md, cross-project, quality re-ranking, and exact-match boost lists were added — dropping all high-value supplemental lists. Fixed by tracking list boundaries:

- Primary lists (original query BM25 + recency): always kept
- Supplemental lists (vector, lat.md, cross-project, quality, exact-match): always kept
- Expanded-query lists: trimmed first when over budget
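The priority rule above might look roughly like the sketch below. Only `MAX_RRF_LISTS = 10` comes from the description; the `RankedList` type, the `capRrfLists` function, and its three-group signature are assumptions for illustration.

```typescript
// Hypothetical types and names; only MAX_RRF_LISTS = 10 is from the PR text.
type RankedList = string[]; // document IDs in rank order

const MAX_RRF_LISTS = 10;

function capRrfLists(
  primary: RankedList[],
  expanded: RankedList[],
  supplemental: RankedList[],
): RankedList[] {
  // Budget left for expanded-query lists after the always-kept groups.
  const budget = Math.max(
    0,
    MAX_RRF_LISTS - primary.length - supplemental.length,
  );
  // Expanded-query lists are the only group that gets trimmed.
  return [...primary, ...expanded.slice(0, budget), ...supplemental];
}
```

In this sketch, if the primary and supplemental groups alone exceed the cap, they are still all kept and every expanded-query list is dropped; whether the actual fix behaves that way is not stated in the description.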

Minor: Documentation (M6)

Added a comment documenting that /v1/models passthrough only supports the Anthropic upstream.

Verification

Typecheck: all 4 packages pass
Tests: 1254 pass, 0 fail

- Add cancel() handlers to both OpenAI streaming translators to stop upstream reads on client disconnect (C2)
- Emit response.failed event in Responses API translator on error instead of silently closing the stream, preventing client hangs (C1)
- Use safeEnqueue pattern in both translators to gracefully handle enqueue-after-cancel without throwing (C2)
- Replace naive 2-char prefix CLI suggestion matching with Levenshtein distance for accurate 'did you mean?' suggestions (M4)
- Fix RRF list cap to trim expanded-query lists first, preserving high-value vector, lat.md, cross-project, and exact-match lists (m4)
- Document /v1/models endpoint Anthropic-only limitation (M6)

BYK merged commit bdf8b60 into main May 12, 2026
7 checks passed

BYK deleted the fix/streaming-robustness-and-review-fixes branch May 12, 2026 20:26

This was referenced May 13, 2026

publish: BYK/loreai@0.18.0 #294

Closed

publish: BYK/loreai@0.18.0 #296

Closed

BYK merged 1 commit into main from fix/streaming-robustness-and-review-fixes
