fix: guard quantiseAndSend against zero-length sequence and hidden dim #78
Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>
Agent-Logs-Url: https://github.com/devlux76/q2/sessions/6abf842b-add1-42e8-b3f7-5a6e525a3b75
Pull request overview
Adds input validation to quantiseAndSend() in the worker to avoid unsafe/invalid WASM kernel calls when the embedding shape is degenerate (e.g., zero-length sequences).
Changes:
- Add early-return guard for seqLen < 1 with a warning log.
- Add early-return guard for hiddenDim < 1 with a warning log.
    if (seqLen < 1) {
      workerLog('warn', 'quantiseAndSend: seqLen < 1; skipping Q² quantisation', { seqLen });
      return;
    }
    if (hiddenDim < 1) {
      workerLog('warn', 'quantiseAndSend: hiddenDim < 1; skipping Q² quantisation', { hiddenDim });
      return;
    }
The new guards prevent the seqLen=0/hiddenDim=0 trap, but quantiseAndSend still doesn’t validate other documented kernel preconditions (e.g., n must be a power of 2 and ≤ 16,384, and the buffer must contain at least the last-token row). Without these checks it’s still possible to call into the WASM kernel with unsupported dimensions or an undersized buffer, which can lead to traps or incorrect fingerprints. Consider extending the early validation to enforce the kernel’s n constraints (and, if applicable, the expected byteLength for the given dtype) before copying/calling into WASM.
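The extended validation suggested in this comment could look roughly like the sketch below. It is not taken from the repository: the helper name checkKernelPreconditions, the dtype table, and the assumption that the kernel's n corresponds to hiddenDim and that the buffer is laid out row-major as [seqLen, hiddenDim] are all hypothetical and would need to be checked against the actual kernel documentation.

```typescript
// Hypothetical helper: the names, the dtype table and the mapping n = hiddenDim
// are assumptions based on the review comment, not the repository's code.
type Dtype = 'float32' | 'float16' | 'int8';

const BYTES_PER_ELEMENT: Record<Dtype, number> = {
  float32: 4,
  float16: 2,
  int8: 1,
};

const MAX_N = 16384; // upper bound on n quoted in the review comment

function isPowerOfTwo(n: number): boolean {
  // A positive integer is a power of two when exactly one bit is set.
  return Number.isInteger(n) && n > 0 && (n & (n - 1)) === 0;
}

/** Returns null when the shape looks acceptable, otherwise a reason string. */
function checkKernelPreconditions(
  seqLen: number,
  hiddenDim: number,
  byteLength: number,
  dtype: Dtype,
): string | null {
  if (!Number.isInteger(seqLen) || seqLen < 1) return 'seqLen < 1';
  if (!Number.isInteger(hiddenDim) || hiddenDim < 1) return 'hiddenDim < 1';
  if (hiddenDim > MAX_N) return 'hiddenDim > 16384';
  if (!isPowerOfTwo(hiddenDim)) return 'hiddenDim is not a power of 2';
  // Assuming a row-major [seqLen, hiddenDim] layout, the last-token row
  // ends at seqLen * hiddenDim elements.
  const requiredBytes = seqLen * hiddenDim * BYTES_PER_ELEMENT[dtype];
  if (byteLength < requiredBytes) return 'buffer too small for last-token row';
  return null;
}
```

Returning a reason string rather than a boolean lets the caller pass it straight to the existing warning log before the early return.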
This change introduces new early-return behavior for invalid seqLen/hiddenDim, but there’s no unit test asserting that the Q² kernel is not invoked in these cases. Since the repo already has worker unit tests, consider adding a test that triggers the embedding extraction path with seqLen=0 and/or hiddenDim=0 and verifies getKernel/quantise are not called (and that the worker continues without error).
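A test along the lines requested here might be shaped like the following sketch. It uses a hand-rolled spy and a stand-in quantiseAndSend so that it runs on its own; the real function in src/worker.ts has a different signature, and the repo's existing test framework and mocking helpers should be used in practice. Everything named below other than the two guard conditions and their log messages is hypothetical.

```typescript
// Stand-in types and function: the real quantiseAndSend lives in
// src/worker.ts, and this injected-dependency signature is an assumption.
interface Kernel {
  quantise(data: Float32Array, seqLen: number, hiddenDim: number): Uint8Array;
}

function quantiseAndSend(
  getKernel: () => Kernel,
  log: (level: string, msg: string) => void,
  data: Float32Array,
  seqLen: number,
  hiddenDim: number,
): void {
  if (seqLen < 1) {
    log('warn', 'quantiseAndSend: seqLen < 1; skipping Q² quantisation');
    return;
  }
  if (hiddenDim < 1) {
    log('warn', 'quantiseAndSend: hiddenDim < 1; skipping Q² quantisation');
    return;
  }
  getKernel().quantise(data, seqLen, hiddenDim);
}

// Runs one scenario and reports how often the kernel was touched.
function runGuardScenario(seqLen: number, hiddenDim: number) {
  let getKernelCalls = 0;
  let quantiseCalls = 0;
  const warnings: string[] = [];
  const kernel: Kernel = {
    quantise(_data, _seqLen, _hiddenDim) {
      quantiseCalls++;
      return new Uint8Array(8); // dummy fingerprint
    },
  };
  quantiseAndSend(
    () => {
      getKernelCalls++;
      return kernel;
    },
    (level, msg) => {
      if (level === 'warn') warnings.push(msg);
    },
    new Float32Array(seqLen * hiddenDim),
    seqLen,
    hiddenDim,
  );
  return { getKernelCalls, quantiseCalls, warnings };
}
```

A degenerate scenario such as runGuardScenario(0, 128) should report zero calls to both getKernel and quantise and exactly one warning, while a valid shape should reach the kernel once.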
quantiseAndSend() had no input validation before calling into the WASM kernel. With seqLen=0, the kernel computes lastTokenPos = seqLen - 1 = -1, which wraps to a large address in WASM linear memory, resulting in an invalid read.
Changes
src/worker.ts: Added early-return guards at the top of quantiseAndSend() for both seqLen < 1 and hiddenDim < 1, logging a warning and skipping the kernel call.
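To make the wrap-around described for seqLen=0 concrete, the sketch below mimics 32-bit unsigned index arithmetic in TypeScript. The kernel's real offset computation is not shown in this PR, so the function name, the row-major layout, and the choice of 4 bytes per element are assumptions used only for illustration.

```typescript
// Illustrative only: imitates how an index would behave if the kernel
// computes it with 32-bit unsigned arithmetic (an assumption).
function lastTokenByteOffset(
  seqLen: number,
  hiddenDim: number,
  bytesPerElement: number,
): number {
  // `>>> 0` reinterprets the result as an unsigned 32-bit integer,
  // so 0 - 1 becomes 4294967295 instead of -1.
  const lastTokenPos = (seqLen - 1) >>> 0;
  // Math.imul keeps each multiplication within 32 bits, like i32.mul.
  const elementOffset = Math.imul(lastTokenPos, hiddenDim);
  return Math.imul(elementOffset, bytesPerElement) >>> 0;
}

// With a normal shape the offset is small: row 3 of a 4 x 256 float32 tensor.
const validOffset = lastTokenByteOffset(4, 256, 4); // 3072

// With seqLen = 0 the offset lands just below 4 GiB, far outside a small
// linear memory, which is why the guard returns before the kernel is called.
const wrappedOffset = lastTokenByteOffset(0, 256, 4); // 4294966272
```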