Skip to content

Abu/multimodal#16

Merged
DIodide merged 3 commits intostagingfrom
abu/multimodal
Mar 23, 2026
Merged

Abu/multimodal#16
DIodide merged 3 commits intostagingfrom
abu/multimodal

Conversation

@abubahmed
Copy link
Copy Markdown
Collaborator

  • Added multimodal support for PDFs, audio files, and images
  • Hardcoded modals that can support each type (OpenRouter doesn't provide this info)
  • All media types are stored in storage, but OpenRouter doesn't take URLs for audio (passed in as base64 encoded data)
  • Copy paste, drag and drop, and file upload support
  • For audio, users can record audio directly into the website

@DIodide
Copy link
Copy Markdown
Owner

DIodide commented Mar 23, 2026

@claude review this

@DIodide DIodide self-requested a review March 23, 2026 10:42
@claude
Copy link
Copy Markdown

claude bot commented Mar 23, 2026

Claude finished @DIodide's task in 2m 7s —— View job


Code Review: Abu/multimodal PR

  • Gather context
  • Read changed files
  • Analyze code for bugs, security issues, and best practices
  • Post review feedback

Overall this is a clean implementation. The hook/component separation is good, URL revocation is handled correctly, and the Convex auth gates are in place. A few bugs and a couple of security/UX concerns worth addressing before merge:


Bugs

1. Can't send attachments-only (no text)

apps/web/src/routes/chat/index.tsx:2723

if (!content || !activeHarness) return;

If a user drops an image and hits send without typing any text, the message is silently blocked. The guard should be if ((!content && readyAttachments.length === 0) || !activeHarness), but readyAttachments is computed after this check — the condition needs restructuring. Fix this →

2. MediaRecorder throws (not caught) on Safari/iOS

apps/web/src/routes/chat/index.tsx:2651

const recorder = new MediaRecorder(stream, { mimeType: "audio/webm" });

Safari doesn't support audio/webm. The MediaRecorder constructor throws a NotSupportedError synchronously, but the catch block on line 2665 only handles getUserMedia denials (based on the toast message). On Safari, users will hit an unhandled error instead of a clear message. Use MediaRecorder.isTypeSupported("audio/webm") before passing the mimeType, and fall back to audio/mp4 (which Safari supports). Fix this →

3. MediaRecorder stream leak on component unmount

apps/web/src/routes/chat/index.tsx:2645-2673

If a user navigates away mid-recording, mediaRecorderRef is never stopped and the mic stream is never released. A useEffect cleanup returning () => { mediaRecorderRef.current?.stop(); } is needed. Fix this →

4. Attachments silently dropped when message is enqueued during streaming

apps/web/src/routes/chat/index.tsx:2728-2733

clearAttachments();
if (isStreaming && conversationId) {
    onEnqueue(content);  // attachments are gone, only text enqueued
    return;
}

If the user sends while streaming is active, attachments are cleared but the queued message only carries text. The attachments are lost silently. This probably needs a design decision, but at minimum users should be warned or the enqueue path should not clear attachments.


Security

5. No server-side file size enforcement

packages/convex-backend/convex/files.ts:9-15

generateUploadUrl returns an upload URL with no size constraints. The 10/20/25 MB limits are client-side only and trivially bypassed. Convex doesn't natively expose byte limits on presigned URLs, but you could add a post-upload validation step (check ctx.storage.getMetadata after upload and delete if oversized), or document this as an accepted risk.

6. getFileUrl doesn't verify file ownership

packages/convex-backend/convex/files.ts:22-29

Any authenticated user who knows a storageId can resolve its URL. Since storageIds are opaque IDs and aren't guessable in practice this is low-risk, but worth noting. A stricter implementation would cross-reference the storageId against the messages table to verify the requesting user owns a message containing that file.


Performance

7. resolveSignedUrls called for every historical attachment on every send

apps/web/src/routes/chat/index.tsx:2779-2785

Every time a user sends a message, the code re-fetches signed URLs for all attachments in the entire conversation history. For a long conversation with many images, this fires many Convex queries in series. Consider caching resolved URLs (they're already being fetched reactively for display in MessageAttachments) or passing them through from the displayed messages instead of re-resolving.


Minor / Nits

8. modelSupportsMedia only checks for image

apps/web/src/lib/models.ts:36-38

export function modelSupportsMedia(model: string | undefined): boolean {
    return modelHas(model, "image");
}

The name implies "any media" but it only gates on image support. This means a hypothetical audio-only model would not show the attachment UI at all. Consider renaming to modelSupportsImages or broadening the check to image || pdf || audio for the attachment button visibility. Looking at the call site in index.tsx, supportsAnyAttachment is used for the attachment button — verify this correctly uses allowedMimeTypes(model).size > 0 rather than modelSupportsMedia.

9. buildMultimodalContent returns [] when all attachments are unrecognised types

apps/web/src/lib/multimodal.ts:61-63

If somehow a file with a MIME type that doesn't match any branch ends up in atts (e.g. video/mp4), it's silently skipped and the function may return an empty array. Since MIME validation happens upstream this shouldn't occur, but a console.warn or explicit default case would make debugging easier.


Copy link
Copy Markdown
Owner

@DIodide DIodide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@DIodide DIodide merged commit 943e896 into staging Mar 23, 2026
2 checks passed
@DIodide DIodide deleted the abu/multimodal branch March 23, 2026 15:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants