-
Notifications
You must be signed in to change notification settings - Fork 11
🤖 feat: Drag-and-drop image support + fixes #308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Images were being saved to history but not transmitted to the AI model. **Root cause:** - Our CmuxImagePart used type:'image' with fields 'image' and 'mimeType' - AI SDK's convertToModelMessages() only processes type:'file' parts - Images were filtered out before reaching the model **Fix:** - Changed CmuxImagePart to match AI SDK's FileUIPart format: - type: 'image' → 'file' - image → url - mimeType → mediaType - Updated all references across frontend, backend, and type definitions - Updated message aggregation to filter for type:'file' instead of type:'image' **Files changed:** - Types: message.ts, ipc.ts - Backend: agentSession.ts, ipcMain.ts, StreamingMessageAggregator.ts, modelMessageTransform.ts - Frontend: ChatInput.tsx, ImageAttachments.tsx, UserMessage.tsx - Stories: UserMessage.stories.tsx Images now flow correctly from UI → history → AI model.
- Test image transmission to AI model and response - Test image persistence in chat history - Uses 1x1 pixel PNG as minimal test fixture - Verifies both Anthropic and OpenAI providers
Reduces duplication and net LoC: - Add waitForStreamSuccess() - combines create collector + wait + assert - Add readChatHistory() - reads and parses chat.jsonl - Add TEST_IMAGES constant - reusable 1x1 pixel fixtures Image tests now: - 36 lines shorter (removed boilerplate) - More declarative and readable - Easier to add similar tests in future Net change: -36 lines in sendMessage.test.ts, +39 in helpers.ts (+3 total)
Consolidates repeated patterns: - createEventCollector + waitForEvent + assertStreamSuccess - Now just: await waitForStreamSuccess() Reduces 3 lines to 1 in multiple tests for: - bash tool tests - conversation continuity test - additional system instructions test Net: -8 lines
- Import fs/promises correctly in readChatHistory - Add type annotation for line parameter - Cast deltas to StreamDeltaEvent for textDelta access - Add proper null checks for userMessage and imagePart
- Extract image handling utilities to src/utils/imageHandling.ts - Unified paste and drop logic for cleaner code - processImageFiles handles async conversion to base64 - extractImagesFromClipboard/Drop filter image files - Extract toast utilities to src/components/ChatInputToasts.tsx - createCommandToast and createErrorToast - Removed 190 lines from ChatInput.tsx (1072 → 882, -17.7%) - Add drag-and-drop support to ChatInput - onDragOver handler checks for Files and sets dropEffect - onDrop handler processes dropped images - Works alongside existing paste support - Add comprehensive unit tests - src/utils/imageHandling.test.ts covers all utilities - Mock FileReader for Node.js test environment - 10 tests, all passing Users can now drag images directly into the chat input instead of only pasting them.
When image parts fail validation, errors now include:
- Index of the failing image part
- Type of the invalid field (got typeof X)
- First 50-200 chars of actual data received
- Specific check that failed (url, data URL format, mediaType)
Frontend validation logs errors to console before sending,
making it easier to catch issues client-side.
Backend validation provides detailed context in assertion messages,
making it clear what was received vs. what was expected.
Example new error:
"image part [0] must include url string content (got undefined): {\"id\":\"...\"}"
vs old error:
"image part must include base64 string content"
Some browsers/OS combinations (e.g., macOS drag-and-drop) don't populate file.type for dragged files. This causes mediaType to be empty string, which fails validation in the AI SDK. Solution: Fall back to detecting MIME type from file extension when file.type is empty. Defaults to image/png if extension is unrecognized. Supported extensions: png, jpg, jpeg, gif, webp, bmp, svg
- Remove unused ParsedCommand import - Fix async event handler warnings (use void with .then()) - Fix prefer-nullish-coalescing warnings (?? instead of ||) - Fix consistent-type-assertions in tests (use as at call site) - Remove unused beforeEach import
CI has stricter TypeScript settings - need to cast through unknown when mocking complex types like DataTransfer.
ammario
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Manually verified — macOS
|
cc @kylecarbs if your image paste breaks it's probably because of this — although I'm pretty sure it won't |
## Summary
Adds drag-and-drop support for images in the chat input, refactors
ChatInput.tsx for maintainability, and fixes image handling bugs.
---
## New Feature: Drag-and-Drop Images
Users can now drag image files directly into the chat input box
alongside existing paste (Ctrl+V/Cmd+V) functionality.
**How it works:**
- Drag image file(s) over the chat input textarea
- Drop indicator shows (cursor changes to copy icon)
- Images convert to base64 and appear as thumbnails
- Multiple images supported in one operation
---
## Code Refactoring
Reduced ChatInput.tsx from 1072 → 882 lines (-190 lines, -17.7%):
**New Files:**
- `src/utils/imageHandling.ts` (73 lines) - Image conversion utilities
- `src/components/ChatInputToasts.tsx` (200 lines) - Toast creation
logic
- `src/utils/imageHandling.test.ts` (180 lines) - 10 comprehensive tests
**Benefits:**
- Unified logic for paste and drop (no duplication)
- Async/await instead of nested FileReader callbacks
- Toast utilities separated from UI component
- Easy to test in isolation
---
## Bug Fixes
### 1. macOS Drag-and-Drop MIME Type Issue
**Problem:** Dragging PNG files from macOS Finder failed with "image
part must include base64 string content"
**Root Cause:** macOS doesn't populate `file.type` for dragged files,
resulting in empty mediaType
**Solution:** Added MIME type detection fallback:
- Primary: Use `file.type` if available
- Fallback: Detect from file extension (png, jpg, jpeg, gif, webp, bmp,
svg)
- Default: Use `image/png` if unrecognized
### 2. Better Error Messages for Image Validation
**Old:** `"image part must include base64 string content"`
**New:** `"image part [0] must include url string content (got
undefined): {\"id\":\"...\"}"`
Errors now include:
- Index of failing image
- Type of invalid field
- First 50-200 chars of actual data
- Specific validation that failed
---
## Testing
- ✅ 583 unit tests pass
- ✅ 10 new image handling tests
- ✅ All integration tests pass
- ✅ Type checking passes
- ✅ Manually verified drag-and-drop on macOS
---
## Files Changed
```
src/components/ChatInput.tsx | 254 ++++---- (1072 → 882 lines)
src/components/ChatInputToasts.tsx | 200 +++++++ (new)
src/utils/imageHandling.test.ts | 180 +++++++ (new)
src/utils/imageHandling.ts | 95 +++++++ (new)
src/services/agentSession.ts | 17 +++--- (better errors)
```
---
## Commits
- 🤖 feat: Add drag-and-drop image support + refactor ChatInput
- 🤖 improve: Add detailed debugging info for image validation errors
- 🤖 fix: Handle drag-and-drop files with missing MIME type
_Generated with `cmux`_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
**Issue 1: Drag-drop files with empty MIME types were filtered out** - extractImagesFromDrop() rejected files where file.type === "" - This happened BEFORE the MIME type fallback could run - Fix: Also accept files with image extensions when file.type is empty - Test: Added test for macOS drag-drop scenario (empty MIME type) **Issue 2: Breaking change for existing users with saved images** - Changed from type: "image" to type: "file" in PR #308 - StreamingMessageAggregator only looked for type === "file" - Users who saved images before upgrade would lose them - Fix: Accept both "file" (new) and "image" (legacy) types - Uses type casting with eslint-disable for backwards compat Both fixes maintain backwards compatibility with existing chat history while fixing the macOS drag-and-drop bug. Codex comments: - #308 (comment) - #308 (comment)
TypeScript CI required proper type narrowing for the legacy image filter. Added type predicate (p): p is CmuxImagePart to narrow the union type.
Summary
Adds drag-and-drop support for images in the chat input, refactors ChatInput.tsx for maintainability, and fixes image handling bugs.
New Feature: Drag-and-Drop Images
Users can now drag image files directly into the chat input box alongside existing paste (Ctrl+V/Cmd+V) functionality.
How it works:
Code Refactoring
Reduced ChatInput.tsx from 1072 → 882 lines (-190 lines, -17.7%):
New Files:
src/utils/imageHandling.ts(73 lines) - Image conversion utilitiessrc/components/ChatInputToasts.tsx(200 lines) - Toast creation logicsrc/utils/imageHandling.test.ts(180 lines) - 10 comprehensive testsBenefits:
Bug Fixes
1. macOS Drag-and-Drop MIME Type Issue
Problem: Dragging PNG files from macOS Finder failed with "image part must include base64 string content"
Root Cause: macOS doesn't populate
file.typefor dragged files, resulting in empty mediaTypeSolution: Added MIME type detection fallback:
file.typeif availableimage/pngif unrecognized2. Better Error Messages for Image Validation
Old:
"image part must include base64 string content"New:
"image part [0] must include url string content (got undefined): {\"id\":\"...\"}"Errors now include:
Testing
Files Changed
Commits
Generated with
cmux