feat: render low-res image previews for attachments (tmux/zellij-safe) #46

@ezynda3

Feature Description

Today Kit represents attached images purely as a text indicator (a "pill"). For example, when clipboard images are pending, the input component renders only:

[N image(s) attached] ctrl+u to clear

This is rendered in internal/ui/input.go:547-556. There is no visual preview of the actual image content.

The request is to render a low-resolution, in-terminal preview of attached images (a small thumbnail) in place of / alongside the text pill, using a technique that survives terminal multiplexers (tmux, zellij).

Hard constraint: the Kitty graphics protocol is explicitly out of scope (along with Sixel and iTerm2 inline images), because those are graphics escape-sequence protocols that tmux/zellij strip or mangle by default.

Motivation / Use Case

  • The pi agent shows real image previews, while Kit only shows a text placeholder — this is a UX gap when working with screenshots/diagrams.
  • Users frequently paste screenshots from the clipboard (core.ImageAttachment, internal/ui/core/events.go:6) and attach image files via @file references (FilePart, internal/ui/fileutil/processor.go:18). A glanceable thumbnail confirms the right image was attached before sending.
  • Most users run inside tmux/zellij, so the rendering technique must not depend on graphics escape sequences.

Research notes (why not graphics protocols)

pi (badlogic/pi-mono, TypeScript) uses the Kitty graphics protocol + iTerm2 inline images and explicitly disables images under tmux (only re-enabled for Ghostty, which forwards graphics through tmux). That confirms graphics-protocol approaches are the wrong fit for a tmux/zellij-first tool.

The only technique that reliably works inside tmux and zellij is rendering the image as ordinary colored text (SGR color codes + printable Unicode), because the multiplexer treats it like any other styled text.

Proposed Implementation

Rendering technique: Unicode half-blocks + truecolor

Stack two vertical pixels per character cell using ▀ (U+2580, upper half block): foreground = top pixel color, background = bottom pixel color. With truecolor SGR (\x1b[38;2;r;g;bm for the foreground, \x1b[48;2;r;g;bm for the background) this doubles vertical resolution and produces pure text output that passes through tmux/zellij untouched (verified with a local proof-of-concept).
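As an illustration of the half-block technique, here is a minimal standard-library sketch. The names RGB, HalfBlocks and FromImage are hypothetical and not part of Kit; the sketch assumes the image has already been downscaled and pads an odd final row with black.

```go
package main

import (
	"fmt"
	"image"
	"strings"
)

// RGB is a hypothetical 8-bit-per-channel pixel type used only in this sketch.
type RGB struct{ R, G, B uint8 }

// HalfBlocks renders a pixel grid as text, packing two pixel rows into each
// line of cells. px[y][x] is the pixel at column x, row y; all rows are
// assumed to have the same length.
func HalfBlocks(px [][]RGB) string {
	var b strings.Builder
	for y := 0; y < len(px); y += 2 {
		for x := range px[y] {
			top := px[y][x]
			bottom := RGB{} // odd height: pad the missing row with black
			if y+1 < len(px) {
				bottom = px[y+1][x]
			}
			// Foreground paints the upper half of the cell, background the lower.
			fmt.Fprintf(&b, "\x1b[38;2;%d;%d;%dm\x1b[48;2;%d;%d;%dm▀",
				top.R, top.G, top.B, bottom.R, bottom.G, bottom.B)
		}
		// Reset at the end of each line so the background does not bleed.
		b.WriteString("\x1b[0m\n")
	}
	return b.String()
}

// FromImage converts a decoded image into the grid HalfBlocks expects.
func FromImage(img image.Image) [][]RGB {
	r := img.Bounds()
	px := make([][]RGB, r.Dy())
	for y := range px {
		px[y] = make([]RGB, r.Dx())
		for x := range px[y] {
			// RGBA returns 16-bit channels; keep the high byte of each.
			cr, cg, cb, _ := img.At(r.Min.X+x, r.Min.Y+y).RGBA()
			px[y][x] = RGB{uint8(cr >> 8), uint8(cg >> 8), uint8(cb >> 8)}
		}
	}
	return px
}
```

A caller would use it as HalfBlocks(FromImage(img)). Emitting both colors for every cell keeps the loop simple; a real implementation might skip the sequence when the colors match the previous cell to shrink the output.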

Suggested Go packages

  1. github.com/eliukblau/pixterm/pkg/ansimage (preferred) — v1.3.2 (Sept 2024), MIT. Importable sub-package (no CLI dependency). Relevant API:
    • NewScaledFromReader(r io.Reader, y, x int, bg color.Color, sm ScaleMode, dm DitheringMode) (*ANSImage, error)
    • (*ANSImage).Render() string — returns half-block ANSI ready to drop into a lipgloss box.
    • Lightweight deps: disintegration/imaging, lucasb-eyer/go-colorful, golang.org/x/image.
  2. github.com/qeesung/image2ascii — fallback for non-truecolor terminals (ASCII / 256-color).
  3. Hand-rolled (~40 LOC) — trivial half-block loop + golang.org/x/image/draw for downscaling; avoids a dependency and gives full control over theming/fallbacks.

Packages to avoid for this constraint: BourgeoisBear/rasterm (kitty/iterm2/sixel only) and mattn/go-sixel (needs tmux allow-passthrough; unreliable in zellij).
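For the non-truecolor fallback, a hand-rolled renderer would need to quantize each pixel to the 256-color palette. The sketch below shows one possible mapping onto the xterm 6x6x6 color cube (indices 16 to 231); the function names are hypothetical, and it deliberately ignores the grayscale ramp (232 to 255), which a more careful version could use for near-gray pixels.

```go
package main

import "strconv"

// cubeLevels are the six intensity steps of the xterm 6x6x6 color cube.
var cubeLevels = [6]int{0, 95, 135, 175, 215, 255}

// nearestLevel returns the index of the cube step closest to v.
func nearestLevel(v uint8) int {
	best, bestDist := 0, 256
	for i, l := range cubeLevels {
		d := int(v) - l
		if d < 0 {
			d = -d
		}
		if d < bestDist {
			best, bestDist = i, d
		}
	}
	return best
}

// XTerm256 maps an RGB color to a palette index inside the color cube.
func XTerm256(r, g, b uint8) int {
	return 16 + 36*nearestLevel(r) + 6*nearestLevel(g) + nearestLevel(b)
}

// Fg256 returns the SGR sequence selecting the nearest cube color as foreground.
// The background equivalent uses 48 in place of 38.
func Fg256(r, g, b uint8) string {
	return "\x1b[38;5;" + strconv.Itoa(XTerm256(r, g, b)) + "m"
}
```

Quantizing per channel is cheap but not perceptually optimal; it is probably adequate for a thumbnail of this size.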

Proposed helper

Add an internal/ui helper, e.g.:

// internal/ui/imagepreview/imagepreview.go
//
// Render returns a half-block ANSI thumbnail of the image, scaled to fit
// within maxCols x maxRows terminal cells. Falls back to 256-color, then to
// an empty string (caller keeps the text pill) when truecolor is unavailable.
func Render(data []byte, mediaType string, maxCols, maxRows int, bg color.Color) (string, error)
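The decode-and-scale half of such a helper could look roughly like the following sketch. FitPixels, Downscale and Thumbnail are hypothetical names, not existing Kit functions. It assumes one cell holds one pixel horizontally and two vertically (the half-block layout), never upscales, and uses nearest-neighbor sampling from the standard library rather than golang.org/x/image/draw.

```go
package main

import (
	"bytes"
	"errors"
	"image"
	_ "image/gif"  // register decoders for image.Decode
	_ "image/jpeg"
	_ "image/png"
)

// FitPixels returns the pixel size of an imgW x imgH image scaled to fit a
// box of maxCols x maxRows cells, preserving aspect ratio. It returns (0, 0)
// for non-positive input.
func FitPixels(imgW, imgH, maxCols, maxRows int) (w, h int) {
	if imgW <= 0 || imgH <= 0 || maxCols <= 0 || maxRows <= 0 {
		return 0, 0
	}
	boxW, boxH := maxCols, 2*maxRows // two pixel rows per cell
	if imgW <= boxW && imgH <= boxH {
		return imgW, imgH // already small enough; do not upscale
	}
	if imgW*boxH > imgH*boxW {
		w, h = boxW, imgH*boxW/imgW // width is the limiting side
	} else {
		w, h = imgW*boxH/imgH, boxH // height is the limiting side
	}
	if w < 1 {
		w = 1
	}
	if h < 1 {
		h = 1
	}
	return w, h
}

// Downscale resamples src to w x h pixels using nearest-neighbor sampling.
func Downscale(src image.Image, w, h int) *image.RGBA {
	sb := src.Bounds()
	dst := image.NewRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		sy := sb.Min.Y + y*sb.Dy()/h
		for x := 0; x < w; x++ {
			sx := sb.Min.X + x*sb.Dx()/w
			dst.Set(x, y, src.At(sx, sy))
		}
	}
	return dst
}

// Thumbnail decodes raw attachment bytes and scales them into the cell box.
func Thumbnail(data []byte, maxCols, maxRows int) (image.Image, error) {
	src, _, err := image.Decode(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	w, h := FitPixels(src.Bounds().Dx(), src.Bounds().Dy(), maxCols, maxRows)
	if w == 0 || h == 0 {
		return nil, errors.New("imagepreview: empty image or cell box")
	}
	return Downscale(src, w, h), nil
}
```

With a 40x20 cell box, an 800x600 screenshot comes out as 40x30 pixels, which is 40 columns by 15 rows of half-blocks. Nearest-neighbor is the crudest option; a smoother filter would likely look better on text-heavy screenshots.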

Wiring points

  • Clipboard images: internal/ui/input.go:547-556 currently renders the [N image(s) attached] pill. Replace/augment with a cached thumbnail for each pendingImages entry (internal/ui/input.go:81, type core.ImageAttachment at internal/ui/core/events.go:6, fields Data []byte + MediaType string).
  • @file image attachments: FilePart (internal/ui/fileutil/processor.go:18, populated around processor.go:114 and :160); binary detection already keys off strings.HasPrefix(mediaType, "image/") (processor.go:292).
  • Message history images: message.ImageContent (internal/message/content.go:97, fields Data []byte / MediaType string; accessor Message.Images() at content.go:176) for previews in rendered transcript blocks.
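Because the same attachment bytes can surface at more than one of these wiring points, one option is a small cache keyed by content hash, so each image is rendered once. The sketch below is an assumption about how that might be structured; ThumbCache is a hypothetical type, and the render function stands in for whatever helper ends up producing the thumbnail string.

```go
package main

import (
	"crypto/sha256"
	"sync"
)

// ThumbCache memoizes rendered thumbnails by the SHA-256 of the image bytes.
type ThumbCache struct {
	mu      sync.Mutex
	entries map[[sha256.Size]byte]string
	render  func(data []byte) string
}

// NewThumbCache wraps a render function with a content-addressed cache.
func NewThumbCache(render func(data []byte) string) *ThumbCache {
	return &ThumbCache{
		entries: make(map[[sha256.Size]byte]string),
		render:  render,
	}
}

// Get returns the cached thumbnail for data, rendering it on first use.
func (c *ThumbCache) Get(data []byte) string {
	key := sha256.Sum256(data)
	c.mu.Lock()
	defer c.mu.Unlock()
	if s, ok := c.entries[key]; ok {
		return s
	}
	s := c.render(data)
	c.entries[key] = s
	return s
}
```

If previews are rendered at more than one size, the key would also need to include the cell box; this sketch assumes a single fixed size and never evicts entries.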

Implementation notes / gotchas

  • Truecolor detection: half-block fidelity needs 24-bit color. Gate on COLORTERM=truecolor/24bit; degrade to 256-color, then to the existing text pill. (Style/terminal detection lives in internal/ui/style/.)
  • Sizing: cap previews to a small cell box (e.g. ~40x20 cells) for the low-res look and to keep scrollback light.
  • Bubble Tea: render the thumbnail string once when the attachment is added and cache it — never re-render per Update()/frame; the output is a static styled string.
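  • Width math: measure with len([]rune(s)), not bytes, because each ▀ is multi-byte in UTF-8 (per AGENTS.md Unicode note). Strip the SGR sequences before counting, since they add runes but occupy no cells.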
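The truecolor gate described in the first note could be sketched as below. ColorDepth and DetectColorDepth are hypothetical names, not the existing code in internal/ui/style/. The COLORTERM check follows the note above; treating a TERM value that contains "256color" as 256-color capable is an added assumption.

```go
package main

import (
	"os"
	"strings"
)

// ColorDepth describes how a thumbnail may be rendered.
type ColorDepth int

const (
	TextOnly  ColorDepth = iota // keep the existing text pill
	Color256                    // 256-color half-blocks
	TrueColor                   // 24-bit half-blocks
)

// DetectColorDepth classifies the terminal from the COLORTERM and TERM values.
func DetectColorDepth(colorterm, term string) ColorDepth {
	switch strings.ToLower(colorterm) {
	case "truecolor", "24bit":
		return TrueColor
	}
	if strings.Contains(term, "256color") {
		return Color256
	}
	return TextOnly
}

// DetectFromEnv reads the values from the process environment.
func DetectFromEnv() ColorDepth {
	return DetectColorDepth(os.Getenv("COLORTERM"), os.Getenv("TERM"))
}
```

Passing the values in as arguments keeps the decision testable without touching the environment. Inside tmux the reported values depend on the multiplexer's own configuration, so the result may be more conservative than the outer terminal allows.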

Environment

  • Repo: mark3labs/kit
  • Commit: ae722d52
  • Component: UI (internal/ui), specifically input attachment rendering and message rendering

Metadata

  • Labels: enhancement (New feature or request)
