Fix trimming handling on image arrays#331
Conversation
There was a problem hiding this comment.
Pull request overview
This PR implements image limiting functionality to prevent context overflow from multimodal tokens and improves error handling for unresolved media placeholders. It also introduces configurable stream timeout settings to better support long-running LLM requests.
- Added image limiting logic to cap tool images at 6 and user images at 15 across all messages
- Implemented error recovery for unresolved media placeholders to prevent LLM errors
- Made stream timeout configurable via environment variable with a default of 10 minutes
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
chat_completion_client.go |
Refactored to support configurable stream timeout via functional options pattern |
message_trimmer.go |
Added new LimitImagesInMessages function to cap image counts by role (tool/user) |
chat_handler.go |
Integrated image limiting into message processing pipeline and added stripUnresolvedMediaPlaceholders for error recovery |
inference_provider.go (both files) |
Updated constructors to accept config and pass stream timeout to chat completion clients |
resolver.go |
Enhanced logging to extract and display placeholder IDs during media resolution |
config.go |
Added StreamTimeout configuration field with 600s default |
wire_gen.go |
Updated dependency injection to pass config to InferenceProvider constructors |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| // LimitImagesInMessages limits the number of images in messages to prevent context overflow. | ||
| // It keeps the most recent images and removes older ones. | ||
| // - maxToolImages: maximum images across all tool messages (default: MaxToolImages = 10) |
There was a problem hiding this comment.
The documentation comment states "default: MaxToolImages = 10" but the actual constant value defined on line 53 is 6, not 10. This inconsistency between documentation and code will confuse users of this function.
| // - maxToolImages: maximum images across all tool messages (default: MaxToolImages = 10) | |
| // - maxToolImages: maximum images across all tool messages (default: MaxToolImages = 6) |
| // This is a stringified JSON that wasn't parsed - try to clean it | ||
| match := janMediaPlaceholderPattern.FindString(msg.Content) | ||
| log.Warn(). | ||
| Int("message_index", i). | ||
| Str("role", msg.Role). | ||
| Str("placeholder_id", match). | ||
| Msg("found unresolved placeholder in content field, attempting to clean") | ||
| strippedIDs = append(strippedIDs, match) | ||
| strippedCount++ |
There was a problem hiding this comment.
The code detects unresolved placeholders in the Content field and logs them, but it never actually removes or cleans the placeholder from the content. The message with the unresolved placeholder is still added to the result (line 784), which means the placeholder will still be sent to the LLM, potentially causing the "Non-base64 digit found" error that this function is supposed to prevent.
The code should either:
- Remove the placeholder from msg.Content (e.g., using regex replacement)
- Replace msg.Content with a fallback message like "[Image could not be loaded]"
- Skip adding this message to the result
The current implementation only logs the issue but doesn't fix it.
| // This is a stringified JSON that wasn't parsed - try to clean it | |
| match := janMediaPlaceholderPattern.FindString(msg.Content) | |
| log.Warn(). | |
| Int("message_index", i). | |
| Str("role", msg.Role). | |
| Str("placeholder_id", match). | |
| Msg("found unresolved placeholder in content field, attempting to clean") | |
| strippedIDs = append(strippedIDs, match) | |
| strippedCount++ | |
| // This is a stringified JSON that wasn't parsed - replace it with a safe fallback | |
| match := janMediaPlaceholderPattern.FindString(msg.Content) | |
| log.Warn(). | |
| Int("message_index", i). | |
| Str("role", msg.Role). | |
| Str("placeholder_id", match). | |
| Msg("found unresolved placeholder in content field, replacing with fallback text") | |
| strippedIDs = append(strippedIDs, match) | |
| strippedCount++ | |
| // Ensure unresolved placeholder content is not sent to the LLM | |
| newMsg.Content = "[Image could not be loaded]" |
| } | ||
|
|
||
| // Third, limit images to prevent context overflow from multimodal tokens | ||
| // Tool messages: max 10 images, User messages: max 15 images |
There was a problem hiding this comment.
The comment states "Tool messages: max 10 images, User messages: max 15 images" but according to the constants defined in message_trimmer.go (lines 53-54), MaxToolImages is 6, not 10. This inconsistency between the comment and the actual constant values will mislead developers reading this code.
| // Tool messages: max 10 images, User messages: max 15 images | |
| // Tool messages: max 6 images, User messages: max 15 images |
| func LimitImagesInMessages(messages []openai.ChatCompletionMessage) []openai.ChatCompletionMessage { | ||
| log := logger.GetLogger() | ||
|
|
||
| // Create a deep copy to avoid modifying the original | ||
| result := make([]openai.ChatCompletionMessage, len(messages)) | ||
| for i := range messages { | ||
| result[i] = messages[i] | ||
| // Deep copy MultiContent if present | ||
| if len(messages[i].MultiContent) > 0 { | ||
| result[i].MultiContent = make([]openai.ChatMessagePart, len(messages[i].MultiContent)) | ||
| copy(result[i].MultiContent, messages[i].MultiContent) | ||
| } | ||
| } | ||
|
|
||
| // Count images by role, tracking message index and part index for each image | ||
| var toolImages []imageLocation | ||
| var userImages []imageLocation | ||
|
|
||
| for i := len(result) - 1; i >= 0; i-- { | ||
| msg := &result[i] | ||
| if len(msg.MultiContent) == 0 { | ||
| continue | ||
| } | ||
|
|
||
| for j := len(msg.MultiContent) - 1; j >= 0; j-- { | ||
| part := msg.MultiContent[j] | ||
| if part.Type == openai.ChatMessagePartTypeImageURL && part.ImageURL != nil && part.ImageURL.URL != "" { | ||
| loc := imageLocation{messageIdx: i, partIdx: j} | ||
| if msg.Role == "tool" { | ||
| toolImages = append(toolImages, loc) | ||
| } else if msg.Role == "user" { | ||
| userImages = append(userImages, loc) | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Reverse to get oldest first (we collected newest first by iterating backwards) | ||
| reverseImageLocations(toolImages) | ||
| reverseImageLocations(userImages) | ||
|
|
||
| toolImagesRemoved := 0 | ||
| userImagesRemoved := 0 | ||
|
|
||
| // Remove excess tool images (oldest first) | ||
| if len(toolImages) > MaxToolImages { | ||
| excessCount := len(toolImages) - MaxToolImages | ||
| for i := 0; i < excessCount; i++ { | ||
| loc := toolImages[i] | ||
| result[loc.messageIdx].MultiContent = removePartAtIndex(result[loc.messageIdx].MultiContent, loc.partIdx) | ||
| // Adjust indices for subsequent removals in the same message | ||
| for j := i + 1; j < len(toolImages); j++ { | ||
| if toolImages[j].messageIdx == loc.messageIdx && toolImages[j].partIdx > loc.partIdx { | ||
| toolImages[j].partIdx-- | ||
| } | ||
| } | ||
| toolImagesRemoved++ | ||
| } | ||
| } | ||
|
|
||
| // Remove excess user images (oldest first) | ||
| if len(userImages) > MaxUserImages { | ||
| excessCount := len(userImages) - MaxUserImages | ||
| for i := 0; i < excessCount; i++ { | ||
| loc := userImages[i] | ||
| result[loc.messageIdx].MultiContent = removePartAtIndex(result[loc.messageIdx].MultiContent, loc.partIdx) | ||
| // Adjust indices for subsequent removals in the same message | ||
| for j := i + 1; j < len(userImages); j++ { | ||
| if userImages[j].messageIdx == loc.messageIdx && userImages[j].partIdx > loc.partIdx { | ||
| userImages[j].partIdx-- | ||
| } | ||
| } | ||
| userImagesRemoved++ | ||
| } | ||
| } | ||
|
|
||
| if toolImagesRemoved > 0 || userImagesRemoved > 0 { | ||
| log.Info(). | ||
| Int("tool_images_before", len(toolImages)). | ||
| Int("tool_images_removed", toolImagesRemoved). | ||
| Int("tool_images_after", len(toolImages)-toolImagesRemoved). | ||
| Int("user_images_before", len(userImages)). | ||
| Int("user_images_removed", userImagesRemoved). | ||
| Int("user_images_after", len(userImages)-userImagesRemoved). | ||
| Msg("limited images in messages to prevent context overflow") | ||
| } | ||
|
|
||
| return result | ||
| } |
There was a problem hiding this comment.
The new LimitImagesInMessages function lacks test coverage. Given that this function implements complex logic for removing images with index adjustments across multiple messages, and there are critical bugs in the index adjustment logic (see related comments), unit tests are essential to ensure correctness. Tests should cover:
- Messages with multiple images in different roles (tool vs user)
- Multiple images in the same message that need removal
- Edge cases like empty messages, messages with only images, and boundary conditions for the max image limits
- Verification that the correct oldest images are removed while keeping the newest ones
| func stripUnresolvedMediaPlaceholders(messages []openai.ChatCompletionMessage) []openai.ChatCompletionMessage { | ||
| log := logger.GetLogger() | ||
| result := make([]openai.ChatCompletionMessage, 0, len(messages)) | ||
| strippedCount := 0 | ||
| var strippedIDs []string | ||
|
|
||
| for i, msg := range messages { | ||
| newMsg := msg | ||
|
|
||
| // Check MultiContent for image parts with placeholders | ||
| if len(msg.MultiContent) > 0 { | ||
| filteredParts := make([]openai.ChatMessagePart, 0, len(msg.MultiContent)) | ||
| for _, part := range msg.MultiContent { | ||
| if part.Type == openai.ChatMessagePartTypeImageURL && part.ImageURL != nil { | ||
| // Check if URL contains a jan_* placeholder | ||
| if janMediaPlaceholderPattern.MatchString(part.ImageURL.URL) { | ||
| // Extract the placeholder ID for logging | ||
| match := janMediaPlaceholderPattern.FindString(part.ImageURL.URL) | ||
| log.Warn(). | ||
| Int("message_index", i). | ||
| Str("role", msg.Role). | ||
| Str("placeholder_id", match). | ||
| Str("full_url", part.ImageURL.URL). | ||
| Msg("stripping unresolved image placeholder from message") | ||
| strippedIDs = append(strippedIDs, match) | ||
| strippedCount++ | ||
| continue // Skip this part | ||
| } | ||
| } | ||
| filteredParts = append(filteredParts, part) | ||
| } | ||
| newMsg.MultiContent = filteredParts | ||
|
|
||
| // If all parts were stripped, add a placeholder text to avoid empty message | ||
| if len(filteredParts) == 0 && len(msg.MultiContent) > 0 { | ||
| newMsg.MultiContent = []openai.ChatMessagePart{ | ||
| { | ||
| Type: openai.ChatMessagePartTypeText, | ||
| Text: "[Image could not be loaded]", | ||
| }, | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Check Content field for embedded jan_* placeholders (rare, but possible) | ||
| if msg.Content != "" && janMediaPlaceholderPattern.MatchString(msg.Content) && strings.Contains(msg.Content, "data:image") { | ||
| // This is a stringified JSON that wasn't parsed - try to clean it | ||
| match := janMediaPlaceholderPattern.FindString(msg.Content) | ||
| log.Warn(). | ||
| Int("message_index", i). | ||
| Str("role", msg.Role). | ||
| Str("placeholder_id", match). | ||
| Msg("found unresolved placeholder in content field, attempting to clean") | ||
| strippedIDs = append(strippedIDs, match) | ||
| strippedCount++ | ||
| } | ||
|
|
||
| result = append(result, newMsg) | ||
| } | ||
|
|
||
| if strippedCount > 0 { | ||
| log.Warn(). | ||
| Int("stripped_count", strippedCount). | ||
| Strs("stripped_ids", strippedIDs). | ||
| Msg("stripped unresolved media placeholders from messages - images not found in media-api database") | ||
| } | ||
|
|
||
| return result | ||
| } |
There was a problem hiding this comment.
The new stripUnresolvedMediaPlaceholders function lacks test coverage. This function handles error recovery by stripping unresolved media placeholders to prevent LLM errors. Test cases should verify:
- Correctly identifies and strips jan_* placeholders from image URLs in MultiContent
- Handles messages with mixed content (some images with placeholders, some without)
- Adds fallback text when all parts are stripped
- Correctly handles placeholders in Content field (though there's a bug here - see related comment)
- Edge cases like empty messages, messages with no placeholders, and various placeholder formats
| // Create a deep copy to avoid modifying the original | ||
| result := make([]openai.ChatCompletionMessage, len(messages)) | ||
| for i := range messages { | ||
| result[i] = messages[i] | ||
| // Deep copy MultiContent if present | ||
| if len(messages[i].MultiContent) > 0 { | ||
| result[i].MultiContent = make([]openai.ChatMessagePart, len(messages[i].MultiContent)) | ||
| copy(result[i].MultiContent, messages[i].MultiContent) | ||
| } |
There was a problem hiding this comment.
The comment claims to "Create a deep copy" but this is actually a shallow copy of the ChatMessagePart structs within MultiContent. The ImageURL pointers within ChatMessagePart are shared between the original and the copy. While this is fine for the current use case (only removing array elements, not modifying ImageURL fields), the comment is misleading. Consider either clarifying the comment to say "Create a copy of the messages and MultiContent arrays to avoid modifying the original" or actually implementing a true deep copy if needed.
| // Remove excess tool images (oldest first) | ||
| if len(toolImages) > MaxToolImages { | ||
| excessCount := len(toolImages) - MaxToolImages | ||
| for i := 0; i < excessCount; i++ { | ||
| loc := toolImages[i] | ||
| result[loc.messageIdx].MultiContent = removePartAtIndex(result[loc.messageIdx].MultiContent, loc.partIdx) | ||
| // Adjust indices for subsequent removals in the same message | ||
| for j := i + 1; j < len(toolImages); j++ { | ||
| if toolImages[j].messageIdx == loc.messageIdx && toolImages[j].partIdx > loc.partIdx { | ||
| toolImages[j].partIdx-- | ||
| } | ||
| } | ||
| toolImagesRemoved++ | ||
| } | ||
| } |
There was a problem hiding this comment.
The sequential removal and index adjustment has a critical bug when multiple images from the same message need to be removed. The issue is that removePartAtIndex modifies the actual slice in the result array, but the index adjustments only propagate forward in the toolImages array.
Consider this scenario: Message 3 has tool images at partIdx [1, 3, 5]. If we need to remove the two oldest (at indices 1 and 3), after processing:
- Iteration i=0: Remove index 1, adjust future indices. Now toolImages[1].partIdx becomes 2 (was 3)
- Iteration i=1: Remove index 2 (adjusted from 3), but toolImages[2].partIdx is still 5, which should now be 4
However, there's a deeper issue: The adjustment happens within the iteration loop, so when we do result[loc.messageIdx].MultiContent = removePartAtIndex(...), we're modifying the actual message, but then trying to adjust indices in the locations array. This creates a desynchronization between the actual state of the message and the indices we're tracking.
A safer approach: Sort the removal locations by messageIdx, then by partIdx in descending order, and remove from highest index to lowest. This way, removing a higher index doesn't affect lower indices.
| // Remove excess user images (oldest first) | ||
| if len(userImages) > MaxUserImages { | ||
| excessCount := len(userImages) - MaxUserImages | ||
| for i := 0; i < excessCount; i++ { | ||
| loc := userImages[i] | ||
| result[loc.messageIdx].MultiContent = removePartAtIndex(result[loc.messageIdx].MultiContent, loc.partIdx) | ||
| // Adjust indices for subsequent removals in the same message | ||
| for j := i + 1; j < len(userImages); j++ { | ||
| if userImages[j].messageIdx == loc.messageIdx && userImages[j].partIdx > loc.partIdx { | ||
| userImages[j].partIdx-- | ||
| } | ||
| } | ||
| userImagesRemoved++ | ||
| } | ||
| } |
There was a problem hiding this comment.
Same critical bug as in the tool images removal section. When multiple user images from the same message need to be removed, the sequential removal and index adjustment approach creates a desynchronization between the actual message state and the tracked indices. The adjustment logic (lines 757-761) attempts to fix indices forward in the array, but this doesn't correctly handle all cases of multiple removals from the same message.
For example, if message 5 has user images at partIdx [2, 4, 7, 9] and we need to remove the three oldest, after removing index 2, the subsequent indices in that message become [3, 6, 8], but the adjustment logic may not correctly propagate all these changes through all remaining items to be removed.
Fix trimming handling on image arrays