Add display_image MCP tool for inline images#700
Conversation
a8aebf7 to
76f15ef
Compare
There was a problem hiding this comment.
Findings
- [Major] Do not trust caller-supplied image MIME type —
display_imageacceptsmimeTypefrom the MCP caller and uses it before extension detection. The generated-image route later streams the registered file bytes to the web UI with that MIME type, so any local file under 25 MB can be registered asimage/pngeven when the path is not an image. That makes this auto-allowed HAPI MCP tool a generic local-file bridge instead of an image display tool. Evidencecli/src/claude/utils/startHappyServer.ts:113.
Suggested fix:const displayImageInputSchema: z.ZodTypeAny = z.object({ path: z.string().describe('Local filesystem path of the image to display to the user'), title: z.string().optional().describe('Optional display title or filename for the image'), }); const mimeType = resolveGeneratedImageMimeType(args.path); if (!mimeType.startsWith('image/')) { throw new Error(`Unsupported image file type for path: ${args.path}`); }
Summary
- Review mode: initial
- One issue found: the new MCP image display path needs to derive or verify MIME type from the actual file instead of trusting tool arguments. Residual risk: no automated test covers rejection of non-image paths or MIME overrides.
Testing
- Not run (automation)
HAPI Bot
76f15ef to
9a99a56
Compare
There was a problem hiding this comment.
Findings
- [Major] Validate image bytes, not just the path extension —
display_imagecallsstat(args.path), which follows symlinks, then derivesimage/png/etc. from the original path string. A path like/tmp/leak.pngcan point at any non-image file under 25 MB and still be registered; the generated-image endpoint later reads that target and serves its bytes as an inline image response. Evidencecli/src/claude/utils/startHappyServer.ts:112.
Suggested fix:const info = await lstat(args.path); if (!info.isFile()) { throw new Error("Path is not a regular file"); } const bytes = await readFile(args.path); const mimeType = detectImageMimeType(bytes); if (!mimeType) { throw new Error("Unsupported image content"); } const image = registerGeneratedImage({ id: randomUUID(), path: args.path, fileName: args.title, mimeType });
Summary
- Review mode: follow-up after new commits
- One issue found: the prior caller-supplied MIME issue is reduced, but the new tool still accepts extension/symlink spoofing instead of validating image content. Residual risk: tests cover the helper override case, but not
display_imagerejection for non-image bytes or symlink paths.
Testing
- Not run (automation)
HAPI Bot
| throw new Error('Image is too large to display inline'); | ||
| } | ||
|
|
||
| const mimeType = resolveGeneratedImageMimeType(args.path); |
There was a problem hiding this comment.
[MAJOR] display_image still validates the path name instead of the file content. stat(args.path) follows symlinks, while resolveGeneratedImageMimeType(args.path) trusts the original path extension, so a path like /tmp/leak.png can point at any non-image file under 25 MB and still be registered; the generated-image endpoint later reads and serves that target as an inline image response.
Suggested fix:
const info = await lstat(args.path);
if (!info.isFile()) {
throw new Error("Path is not a regular file");
}
const bytes = await readFile(args.path);
const mimeType = detectImageMimeType(bytes);
if (!mimeType) {
throw new Error("Unsupported image content");
}
const image = registerGeneratedImage({
id: randomUUID(),
path: args.path,
fileName: args.title,
mimeType
});The mimeType here should be derived by the validation helper, not accepted from MCP arguments.
9a99a56 to
7c1a6d1
Compare
There was a problem hiding this comment.
Findings
- [Major] Snapshot validated image bytes before serving —
display_imagevalidates the current file bytes, but then registers onlyargs.path; the generated-image RPC reads that path later when the browser fetches the image. A file can be swapped after validation, bypassing bothmaxImageBytesanddetectImageMimeType, and the endpoint will serve the replacement under the previously trusted MIME. Evidencecli/src/claude/utils/startHappyServer.ts:120and related read pathcli/src/modules/common/handlers/files.ts:65.
Suggested fix:export type GeneratedImageMetadata = { id: string fileName: string mimeType: string content: Buffer createdAt: number } export function registerGeneratedImage(args: { id: string path: string fileName?: string | null mimeType: string bytes: Uint8Array }): GeneratedImageMetadata { const metadata = { id: args.id, fileName: args.fileName || basename(args.path) || `${args.id}.png`, mimeType: args.mimeType, content: Buffer.from(args.bytes), createdAt: Date.now() } generatedImages.set(args.id, metadata) return metadata } // callers, after detectImageMimeType(bytes) succeeds registerGeneratedImage({ id, path, fileName, mimeType, bytes }) // readGeneratedImage handler return { success: true, content: image.content.toString('base64'), mimeType: image.mimeType, fileName: image.fileName }
Questions
- None.
Summary
- Review mode: follow-up after new commits
- One issue found: the prior MIME-validation problem is mostly addressed, but the new display path still serves a mutable filesystem path after validation instead of the validated bytes. Residual risk: tests cover MIME sniffing, but not replacement between registration and fetch.
Testing
- Not run (automation:
bunis not installed in this runner, sobun run typecheck:cliandbun test cli/src/modules/common/generatedImages.test.tscould not execute)
HAPI Bot
|
|
||
| const image = registerGeneratedImage({ | ||
| id: randomUUID(), | ||
| path: args.path, |
There was a problem hiding this comment.
[MAJOR] display_image validates args.path now, but the registry still stores only that mutable path. The generated-image RPC later reads image.path on demand, so a file can be swapped after validation and before the browser fetch, bypassing both the size cap and MIME sniffing while still being served with the previously trusted MIME.
Suggested fix:
export type GeneratedImageMetadata = {
id: string
fileName: string
mimeType: string
content: Buffer
createdAt: number
}
export function registerGeneratedImage(args: {
id: string
path: string
fileName?: string | null
mimeType: string
bytes: Uint8Array
}): GeneratedImageMetadata {
const metadata = {
id: args.id,
fileName: args.fileName || basename(args.path) || `${args.id}.png`,
mimeType: args.mimeType,
content: Buffer.from(args.bytes),
createdAt: Date.now()
}
generatedImages.set(args.id, metadata)
return metadata
}
// callers, after detectImageMimeType(bytes) succeeds
registerGeneratedImage({ id, path, fileName, mimeType, bytes })
// readGeneratedImage handler
return {
success: true,
content: image.content.toString('base64'),
mimeType: image.mimeType,
fileName: image.fileName
}7c1a6d1 to
1295cc0
Compare
There was a problem hiding this comment.
Findings
- [Major] Bound generated-image bytes kept in memory —
registerGeneratedImagenow snapshots each image withBuffer.from(args.bytes)into the module-levelgeneratedImagesmap, but nothing evicts entries or caps total retained bytes. Sincedisplay_imageallows up to 25 MB per call, a long session or repeated tool calls can grow the CLI process without bound and eventually crash it. Evidencecli/src/modules/common/generatedImages.ts:61.
Suggested fix:const MAX_GENERATED_IMAGE_BYTES = 25 * 1024 * 1024 const MAX_GENERATED_IMAGE_COUNT = 100 let generatedImageBytes = 0 export function registerGeneratedImage(args: { id: string; path: string; mimeType: string; bytes: Uint8Array; fileName?: string | null }): GeneratedImageMetadata { const content = Buffer.from(args.bytes) if (content.byteLength > MAX_GENERATED_IMAGE_BYTES) { throw new Error('Image is too large to display inline') } const previous = generatedImages.get(args.id) if (previous) generatedImageBytes -= previous.content.byteLength const metadata: GeneratedImageMetadata = { id: args.id, fileName: args.fileName || basename(args.path) || `${args.id}.png`, content, mimeType: args.mimeType, createdAt: Date.now() } generatedImages.set(args.id, metadata) generatedImageBytes += content.byteLength while (generatedImages.size > MAX_GENERATED_IMAGE_COUNT) { const oldestId = generatedImages.keys().next().value if (!oldestId) break const oldest = generatedImages.get(oldestId) if (oldest) generatedImageBytes -= oldest.content.byteLength generatedImages.delete(oldestId) } return metadata } export function clearGeneratedImages(): void { generatedImages.clear() generatedImageBytes = 0 }
Questions
- None.
Summary
- Review mode: follow-up after new commits
- The prior validation/snapshotting issue is addressed, but the snapshot storage needs a bounded retention policy. Residual risk: the new MCP
display_imagepath has no integration test covering repeated registrations or eviction behavior.
Testing
- Not run (automation:
bunis not installed in this runner, sobun run typecheck:clicould not execute)
HAPI Bot
| path: args.path, | ||
| fileName: args.fileName || basename(args.path) || `${args.id}.png`, | ||
| mimeType: args.mimeType || resolveGeneratedImageMimeType(args.path), | ||
| content: Buffer.from(args.bytes), |
There was a problem hiding this comment.
[MAJOR] registerGeneratedImage now snapshots every image into the process-wide generatedImages map, but there is no eviction or total byte cap. With display_image allowing up to 25 MB per call, repeated image displays in a long-lived session can grow the CLI process without bound and crash it.
Suggested fix:
const MAX_GENERATED_IMAGE_BYTES = 25 * 1024 * 1024
const MAX_GENERATED_IMAGE_COUNT = 100
let generatedImageBytes = 0
export function registerGeneratedImage(args: { id: string; path: string; mimeType: string; bytes: Uint8Array; fileName?: string | null }): GeneratedImageMetadata {
const content = Buffer.from(args.bytes)
if (content.byteLength > MAX_GENERATED_IMAGE_BYTES) {
throw new Error('Image is too large to display inline')
}
const previous = generatedImages.get(args.id)
if (previous) generatedImageBytes -= previous.content.byteLength
const metadata: GeneratedImageMetadata = {
id: args.id,
fileName: args.fileName || basename(args.path) || `${args.id}.png`,
content,
mimeType: args.mimeType,
createdAt: Date.now()
}
generatedImages.set(args.id, metadata)
generatedImageBytes += content.byteLength
while (generatedImages.size > MAX_GENERATED_IMAGE_COUNT) {
const oldestId = generatedImages.keys().next().value
if (!oldestId) break
const oldest = generatedImages.get(oldestId)
if (oldest) generatedImageBytes -= oldest.content.byteLength
generatedImages.delete(oldestId)
}
return metadata
}
export function clearGeneratedImages(): void {
generatedImages.clear()
generatedImageBytes = 0
}1295cc0 to
47566a9
Compare
There was a problem hiding this comment.
Findings
- [Major] Preserve generated-image event order — the new
void (async () => { ... })()starts image registration outside the Codex notification flow, so the handler immediately continues to later events likeagent_message,task_complete, and the ready timer while a large image is still being read. That can render the generated image after the text/completion that followed it, or after the UI has already been marked ready. Evidencecli/src/codex/codexRemoteLauncher.ts:2216.
Suggested fix:let codexEventQueue = Promise.resolve() const handleCodexEvent = async (msg: Record<string, unknown>): Promise<void> => { // existing event handling... if (msgType === 'generated_image') { const sourceImageId = asString(msg.image_id ?? msg.imageId ?? msg.id) const imageId = randomUUID() const savedPath = asString(msg.saved_path ?? msg.savedPath) if (!savedPath) return const image = await registerGeneratedImageFromPath({ id: imageId, path: savedPath, fileName: asString(msg.file_name ?? msg.fileName) }) if (!image) return messageBuffer.addMessage(`Generated image: ${image.fileName}`, 'assistant') session.sendAgentMessage({ type: 'generated-image', imageId: image.id, sourceImageId, fileName: image.fileName, mimeType: image.mimeType, id: randomUUID() }) return } } appServerClient.setNotificationHandler((method, params) => { const events = appServerEventConverter.handleNotification(method, params) for (const event of events) { const eventRecord = asRecord(event) ?? { type: undefined } codexEventQueue = codexEventQueue .then(() => handleCodexEvent(eventRecord)) .catch((error) => logger.debug('[Codex] Failed to handle app-server event:', error)) } })
Questions
- None.
Summary
- Review mode: follow-up after new commits
- One ordering issue remains in the Codex generated-image path. The prior memory-retention finding appears addressed by the new count/byte eviction policy. Residual risk: no integration test covers generated-image event ordering around following agent/task events.
Testing
- Not run (automation; PR code not executed during review)
HAPI Bot
| mimeType: image.mimeType, | ||
| id: randomUUID() | ||
| }); | ||
| void (async () => { |
There was a problem hiding this comment.
[MAJOR] This detaches image registration from the Codex event stream. While readFile is pending, the handler continues processing later notifications such as following agent_message, task_complete, and the ready timer, so a large generated image can show up after the text/completion that came after it.
Suggested fix:
let codexEventQueue = Promise.resolve()
const handleCodexEvent = async (msg: Record<string, unknown>): Promise<void> => {
// existing event handling...
if (msgType === 'generated_image') {
const sourceImageId = asString(msg.image_id ?? msg.imageId ?? msg.id)
const imageId = randomUUID()
const savedPath = asString(msg.saved_path ?? msg.savedPath)
if (!savedPath) return
const image = await registerGeneratedImageFromPath({
id: imageId,
path: savedPath,
fileName: asString(msg.file_name ?? msg.fileName)
})
if (!image) return
messageBuffer.addMessage(`Generated image: ${image.fileName}`, 'assistant')
session.sendAgentMessage({
type: 'generated-image',
imageId: image.id,
sourceImageId,
fileName: image.fileName,
mimeType: image.mimeType,
id: randomUUID()
})
return
}
}
appServerClient.setNotificationHandler((method, params) => {
const events = appServerEventConverter.handleNotification(method, params)
for (const event of events) {
const eventRecord = asRecord(event) ?? { type: undefined }
codexEventQueue = codexEventQueue
.then(() => handleCodexEvent(eventRecord))
.catch((error) => logger.debug('[Codex] Failed to handle app-server event:', error))
}
})47566a9 to
a1c95c5
Compare
There was a problem hiding this comment.
Findings
None.
Questions
None.
Summary
Review mode: follow-up after new commits
No issues found in the latest full diff. Residual risk: no end-to-end test covers a display_image MCP call through the generated-image fetch route; the added unit tests cover MIME detection, snapshotting, size limit, and eviction.
Testing
Ran git diff --check (passed). Not run: bun run typecheck:cli / tests (bun is not installed in this runner: /bin/bash: bun: command not found).
HAPI Bot
There was a problem hiding this comment.
Findings
None.
Questions
None.
Summary
Review mode: follow-up after new commits
No issues found in the latest full diff. Residual risk: there is still no end-to-end coverage for a display_image MCP call flowing through CLI registration, hub RPC image fetch, and web rendering; the added unit tests cover MIME sniffing, snapshotting, size limit, and count eviction.
Testing
Ran git diff --check origin/main...HEAD (passed). Not run: bun run typecheck:cli / bun run test:cli because bun is not installed in this runner.
HAPI Bot

Summary
display_imagetool that registers a local image as an existinggenerated-imagechat payloaddisplay_imagethrough the stdio MCP bridgeRefs #697
Test
bun run typecheck:cli