Skip to content

Add display_image MCP tool for inline images#700

Merged
tiann merged 2 commits into
tiann:mainfrom
NightWatcher314:feat/display-image-tool
May 27, 2026
Merged

Add display_image MCP tool for inline images#700
tiann merged 2 commits into
tiann:mainfrom
NightWatcher314:feat/display-image-tool

Conversation

@NightWatcher314
Copy link
Copy Markdown
Contributor

Summary

  • add a HAPI MCP display_image tool that registers a local image as an existing generated-image chat payload
  • forward display_image through the stdio MCP bridge
  • update Codex, Claude, and OpenCode prompt hints so agents know to call the tool for local images

Refs #697

Test

  • bun run typecheck:cli

@NightWatcher314
Copy link
Copy Markdown
Contributor Author

图片

@tiann tiann force-pushed the feat/display-image-tool branch from a8aebf7 to 76f15ef Compare May 26, 2026 23:14
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Do not trust caller-supplied image MIME type — display_image accepts mimeType from the MCP caller and uses it before extension detection. The generated-image route later streams the registered file bytes to the web UI with that MIME type, so any local file under 25 MB can be registered as image/png even when the path is not an image. That makes this auto-allowed HAPI MCP tool a generic local-file bridge instead of an image display tool. Evidence cli/src/claude/utils/startHappyServer.ts:113.
    Suggested fix:
    const displayImageInputSchema: z.ZodTypeAny = z.object({
        path: z.string().describe('Local filesystem path of the image to display to the user'),
        title: z.string().optional().describe('Optional display title or filename for the image'),
    });
    
    const mimeType = resolveGeneratedImageMimeType(args.path);
    if (!mimeType.startsWith('image/')) {
        throw new Error(`Unsupported image file type for path: ${args.path}`);
    }

Summary

  • Review mode: initial
  • One issue found: the new MCP image display path needs to derive or verify MIME type from the actual file instead of trusting tool arguments. Residual risk: no automated test covers rejection of non-image paths or MIME overrides.

Testing

  • Not run (automation)

HAPI Bot

Comment thread cli/src/claude/utils/startHappyServer.ts Outdated
@NightWatcher314 NightWatcher314 force-pushed the feat/display-image-tool branch from 76f15ef to 9a99a56 Compare May 26, 2026 23:45
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Validate image bytes, not just the path extension — display_image calls stat(args.path), which follows symlinks, then derives image/png/etc. from the original path string. A path like /tmp/leak.png can point at any non-image file under 25 MB and still be registered; the generated-image endpoint later reads that target and serves its bytes as an inline image response. Evidence cli/src/claude/utils/startHappyServer.ts:112.
    Suggested fix:
    const info = await lstat(args.path);
    if (!info.isFile()) {
        throw new Error("Path is not a regular file");
    }
    
    const bytes = await readFile(args.path);
    const mimeType = detectImageMimeType(bytes);
    if (!mimeType) {
        throw new Error("Unsupported image content");
    }
    
    const image = registerGeneratedImage({
        id: randomUUID(),
        path: args.path,
        fileName: args.title,
        mimeType
    });

Summary

  • Review mode: follow-up after new commits
  • One issue found: the prior caller-supplied MIME issue is reduced, but the new tool still accepts extension/symlink spoofing instead of validating image content. Residual risk: tests cover the helper override case, but not display_image rejection for non-image bytes or symlink paths.

Testing

  • Not run (automation)

HAPI Bot

throw new Error('Image is too large to display inline');
}

const mimeType = resolveGeneratedImageMimeType(args.path);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MAJOR] display_image still validates the path name instead of the file content. stat(args.path) follows symlinks, while resolveGeneratedImageMimeType(args.path) trusts the original path extension, so a path like /tmp/leak.png can point at any non-image file under 25 MB and still be registered; the generated-image endpoint later reads and serves that target as an inline image response.

Suggested fix:

const info = await lstat(args.path);
if (!info.isFile()) {
    throw new Error("Path is not a regular file");
}

const bytes = await readFile(args.path);
const mimeType = detectImageMimeType(bytes);
if (!mimeType) {
    throw new Error("Unsupported image content");
}

const image = registerGeneratedImage({
    id: randomUUID(),
    path: args.path,
    fileName: args.title,
    mimeType
});

The mimeType here should be derived by the validation helper, not accepted from MCP arguments.

@NightWatcher314 NightWatcher314 force-pushed the feat/display-image-tool branch from 9a99a56 to 7c1a6d1 Compare May 26, 2026 23:56
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Snapshot validated image bytes before serving — display_image validates the current file bytes, but then registers only args.path; the generated-image RPC reads that path later when the browser fetches the image. A file can be swapped after validation, bypassing both maxImageBytes and detectImageMimeType, and the endpoint will serve the replacement under the previously trusted MIME. Evidence cli/src/claude/utils/startHappyServer.ts:120 and related read path cli/src/modules/common/handlers/files.ts:65.
    Suggested fix:
    export type GeneratedImageMetadata = {
        id: string
        fileName: string
        mimeType: string
        content: Buffer
        createdAt: number
    }
    
    export function registerGeneratedImage(args: {
        id: string
        path: string
        fileName?: string | null
        mimeType: string
        bytes: Uint8Array
    }): GeneratedImageMetadata {
        const metadata = {
            id: args.id,
            fileName: args.fileName || basename(args.path) || `${args.id}.png`,
            mimeType: args.mimeType,
            content: Buffer.from(args.bytes),
            createdAt: Date.now()
        }
        generatedImages.set(args.id, metadata)
        return metadata
    }
    
    // callers, after detectImageMimeType(bytes) succeeds
    registerGeneratedImage({ id, path, fileName, mimeType, bytes })
    
    // readGeneratedImage handler
    return {
        success: true,
        content: image.content.toString('base64'),
        mimeType: image.mimeType,
        fileName: image.fileName
    }

Questions

  • None.

Summary

  • Review mode: follow-up after new commits
  • One issue found: the prior MIME-validation problem is mostly addressed, but the new display path still serves a mutable filesystem path after validation instead of the validated bytes. Residual risk: tests cover MIME sniffing, but not replacement between registration and fetch.

Testing

  • Not run (automation: bun is not installed in this runner, so bun run typecheck:cli and bun test cli/src/modules/common/generatedImages.test.ts could not execute)

HAPI Bot


const image = registerGeneratedImage({
id: randomUUID(),
path: args.path,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MAJOR] display_image validates args.path now, but the registry still stores only that mutable path. The generated-image RPC later reads image.path on demand, so a file can be swapped after validation and before the browser fetch, bypassing both the size cap and MIME sniffing while still being served with the previously trusted MIME.

Suggested fix:

export type GeneratedImageMetadata = {
    id: string
    fileName: string
    mimeType: string
    content: Buffer
    createdAt: number
}

export function registerGeneratedImage(args: {
    id: string
    path: string
    fileName?: string | null
    mimeType: string
    bytes: Uint8Array
}): GeneratedImageMetadata {
    const metadata = {
        id: args.id,
        fileName: args.fileName || basename(args.path) || `${args.id}.png`,
        mimeType: args.mimeType,
        content: Buffer.from(args.bytes),
        createdAt: Date.now()
    }
    generatedImages.set(args.id, metadata)
    return metadata
}

// callers, after detectImageMimeType(bytes) succeeds
registerGeneratedImage({ id, path, fileName, mimeType, bytes })

// readGeneratedImage handler
return {
    success: true,
    content: image.content.toString('base64'),
    mimeType: image.mimeType,
    fileName: image.fileName
}

@NightWatcher314 NightWatcher314 force-pushed the feat/display-image-tool branch from 7c1a6d1 to 1295cc0 Compare May 27, 2026 00:04
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Bound generated-image bytes kept in memory — registerGeneratedImage now snapshots each image with Buffer.from(args.bytes) into the module-level generatedImages map, but nothing evicts entries or caps total retained bytes. Since display_image allows up to 25 MB per call, a long session or repeated tool calls can grow the CLI process without bound and eventually crash it. Evidence cli/src/modules/common/generatedImages.ts:61.
    Suggested fix:
    const MAX_GENERATED_IMAGE_BYTES = 25 * 1024 * 1024
    const MAX_GENERATED_IMAGE_COUNT = 100
    let generatedImageBytes = 0
    
    export function registerGeneratedImage(args: { id: string; path: string; mimeType: string; bytes: Uint8Array; fileName?: string | null }): GeneratedImageMetadata {
        const content = Buffer.from(args.bytes)
        if (content.byteLength > MAX_GENERATED_IMAGE_BYTES) {
            throw new Error('Image is too large to display inline')
        }
    
        const previous = generatedImages.get(args.id)
        if (previous) generatedImageBytes -= previous.content.byteLength
    
        const metadata: GeneratedImageMetadata = {
            id: args.id,
            fileName: args.fileName || basename(args.path) || `${args.id}.png`,
            content,
            mimeType: args.mimeType,
            createdAt: Date.now()
        }
        generatedImages.set(args.id, metadata)
        generatedImageBytes += content.byteLength
    
        while (generatedImages.size > MAX_GENERATED_IMAGE_COUNT) {
            const oldestId = generatedImages.keys().next().value
            if (!oldestId) break
            const oldest = generatedImages.get(oldestId)
            if (oldest) generatedImageBytes -= oldest.content.byteLength
            generatedImages.delete(oldestId)
        }
        return metadata
    }
    
    export function clearGeneratedImages(): void {
        generatedImages.clear()
        generatedImageBytes = 0
    }

Questions

  • None.

Summary

  • Review mode: follow-up after new commits
  • The prior validation/snapshotting issue is addressed, but the snapshot storage needs a bounded retention policy. Residual risk: the new MCP display_image path has no integration test covering repeated registrations or eviction behavior.

Testing

  • Not run (automation: bun is not installed in this runner, so bun run typecheck:cli could not execute)

HAPI Bot

path: args.path,
fileName: args.fileName || basename(args.path) || `${args.id}.png`,
mimeType: args.mimeType || resolveGeneratedImageMimeType(args.path),
content: Buffer.from(args.bytes),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MAJOR] registerGeneratedImage now snapshots every image into the process-wide generatedImages map, but there is no eviction or total byte cap. With display_image allowing up to 25 MB per call, repeated image displays in a long-lived session can grow the CLI process without bound and crash it.

Suggested fix:

const MAX_GENERATED_IMAGE_BYTES = 25 * 1024 * 1024
const MAX_GENERATED_IMAGE_COUNT = 100
let generatedImageBytes = 0

export function registerGeneratedImage(args: { id: string; path: string; mimeType: string; bytes: Uint8Array; fileName?: string | null }): GeneratedImageMetadata {
    const content = Buffer.from(args.bytes)
    if (content.byteLength > MAX_GENERATED_IMAGE_BYTES) {
        throw new Error('Image is too large to display inline')
    }

    const previous = generatedImages.get(args.id)
    if (previous) generatedImageBytes -= previous.content.byteLength

    const metadata: GeneratedImageMetadata = {
        id: args.id,
        fileName: args.fileName || basename(args.path) || `${args.id}.png`,
        content,
        mimeType: args.mimeType,
        createdAt: Date.now()
    }
    generatedImages.set(args.id, metadata)
    generatedImageBytes += content.byteLength

    while (generatedImages.size > MAX_GENERATED_IMAGE_COUNT) {
        const oldestId = generatedImages.keys().next().value
        if (!oldestId) break
        const oldest = generatedImages.get(oldestId)
        if (oldest) generatedImageBytes -= oldest.content.byteLength
        generatedImages.delete(oldestId)
    }
    return metadata
}

export function clearGeneratedImages(): void {
    generatedImages.clear()
    generatedImageBytes = 0
}

@NightWatcher314 NightWatcher314 force-pushed the feat/display-image-tool branch from 1295cc0 to 47566a9 Compare May 27, 2026 00:11
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Preserve generated-image event order — the new void (async () => { ... })() starts image registration outside the Codex notification flow, so the handler immediately continues to later events like agent_message, task_complete, and the ready timer while a large image is still being read. That can render the generated image after the text/completion that followed it, or after the UI has already been marked ready. Evidence cli/src/codex/codexRemoteLauncher.ts:2216.
    Suggested fix:
    let codexEventQueue = Promise.resolve()
    
    const handleCodexEvent = async (msg: Record<string, unknown>): Promise<void> => {
        // existing event handling...
        if (msgType === 'generated_image') {
            const sourceImageId = asString(msg.image_id ?? msg.imageId ?? msg.id)
            const imageId = randomUUID()
            const savedPath = asString(msg.saved_path ?? msg.savedPath)
            if (!savedPath) return
    
            const image = await registerGeneratedImageFromPath({
                id: imageId,
                path: savedPath,
                fileName: asString(msg.file_name ?? msg.fileName)
            })
            if (!image) return
    
            messageBuffer.addMessage(`Generated image: ${image.fileName}`, 'assistant')
            session.sendAgentMessage({
                type: 'generated-image',
                imageId: image.id,
                sourceImageId,
                fileName: image.fileName,
                mimeType: image.mimeType,
                id: randomUUID()
            })
            return
        }
    }
    
    appServerClient.setNotificationHandler((method, params) => {
        const events = appServerEventConverter.handleNotification(method, params)
        for (const event of events) {
            const eventRecord = asRecord(event) ?? { type: undefined }
            codexEventQueue = codexEventQueue
                .then(() => handleCodexEvent(eventRecord))
                .catch((error) => logger.debug('[Codex] Failed to handle app-server event:', error))
        }
    })

Questions

  • None.

Summary

  • Review mode: follow-up after new commits
  • One ordering issue remains in the Codex generated-image path. The prior memory-retention finding appears addressed by the new count/byte eviction policy. Residual risk: no integration test covers generated-image event ordering around following agent/task events.

Testing

  • Not run (automation; PR code not executed during review)

HAPI Bot

Comment thread cli/src/codex/codexRemoteLauncher.ts Outdated
mimeType: image.mimeType,
id: randomUUID()
});
void (async () => {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MAJOR] This detaches image registration from the Codex event stream. While readFile is pending, the handler continues processing later notifications such as following agent_message, task_complete, and the ready timer, so a large generated image can show up after the text/completion that came after it.

Suggested fix:

let codexEventQueue = Promise.resolve()

const handleCodexEvent = async (msg: Record<string, unknown>): Promise<void> => {
    // existing event handling...
    if (msgType === 'generated_image') {
        const sourceImageId = asString(msg.image_id ?? msg.imageId ?? msg.id)
        const imageId = randomUUID()
        const savedPath = asString(msg.saved_path ?? msg.savedPath)
        if (!savedPath) return

        const image = await registerGeneratedImageFromPath({
            id: imageId,
            path: savedPath,
            fileName: asString(msg.file_name ?? msg.fileName)
        })
        if (!image) return

        messageBuffer.addMessage(`Generated image: ${image.fileName}`, 'assistant')
        session.sendAgentMessage({
            type: 'generated-image',
            imageId: image.id,
            sourceImageId,
            fileName: image.fileName,
            mimeType: image.mimeType,
            id: randomUUID()
        })
        return
    }
}

appServerClient.setNotificationHandler((method, params) => {
    const events = appServerEventConverter.handleNotification(method, params)
    for (const event of events) {
        const eventRecord = asRecord(event) ?? { type: undefined }
        codexEventQueue = codexEventQueue
            .then(() => handleCodexEvent(eventRecord))
            .catch((error) => logger.debug('[Codex] Failed to handle app-server event:', error))
    }
})

@NightWatcher314 NightWatcher314 force-pushed the feat/display-image-tool branch from 47566a9 to a1c95c5 Compare May 27, 2026 00:19
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings
None.

Questions
None.

Summary
Review mode: follow-up after new commits
No issues found in the latest full diff. Residual risk: no end-to-end test covers a display_image MCP call through the generated-image fetch route; the added unit tests cover MIME detection, snapshotting, size limit, and eviction.

Testing
Ran git diff --check (passed). Not run: bun run typecheck:cli / tests (bun is not installed in this runner: /bin/bash: bun: command not found).

HAPI Bot

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings
None.

Questions
None.

Summary
Review mode: follow-up after new commits
No issues found in the latest full diff. Residual risk: there is still no end-to-end coverage for a display_image MCP call flowing through CLI registration, hub RPC image fetch, and web rendering; the added unit tests cover MIME sniffing, snapshotting, size limit, and count eviction.

Testing
Ran git diff --check origin/main...HEAD (passed). Not run: bun run typecheck:cli / bun run test:cli because bun is not installed in this runner.

HAPI Bot

@tiann tiann merged commit de5dc97 into tiann:main May 27, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants