
feat(presentation): embed images in generated decks (#3209) #3299

Merged
senamakel merged 8 commits into tinyhumansai:main from oxoxDev:feat/3209-presentation-images
Jun 4, 2026

Conversation

@oxoxDev
Contributor

@oxoxDev oxoxDev commented Jun 3, 2026

Summary

  • generate_presentation now embeds images in slides, not just text — each slide accepts an images array alongside title/body/bullets.
  • Two image sources: artifact (bytes from a prior tool's output) and file (a local path). Remote URLs are deferred (SSRF surface).
  • Images are validated (PNG/JPEG only, ≤5 MB each, ≤6/slide, ≤8/deck) and laid out single-column beneath the slide text via ppt-rs's SlideContent::add_image.
  • A bad image (wrong type, oversize, unreadable) is skipped with a warning — the deck still renders. Warnings surface in the tool result (image_warnings) and the markdown reply.
  • New artifacts::read_artifact_bytes is the single source of truth for resolving an artifact id → on-disk bytes.
  • Orchestrator prompt documents the images arg and a grounding rule (don't claim what an image shows without verifying it).
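
The count and size caps listed above can be sketched as two small checks. This is a minimal illustration, not the code in types.rs: the constant names match the ones this PR adds, but their types, the reading of "5 MB" as 5 × 1024 × 1024 bytes, and the helper names `check_image_counts` and `check_image_size` are assumptions made for the example.

```rust
// Caps as stated in the summary. The constant names appear in types.rs;
// their types and the binary interpretation of "5 MB" are assumed here.
const MAX_IMAGES_PER_SLIDE: usize = 6;
const MAX_IMAGES_PER_DECK: usize = 8;
const MAX_IMAGE_BYTES: u64 = 5 * 1024 * 1024;

/// Hypothetical helper: reject a deck whose requested image counts exceed
/// the caps. `per_slide[i]` is the number of images requested on slide i.
fn check_image_counts(per_slide: &[usize]) -> Result<(), String> {
    for (idx, &n) in per_slide.iter().enumerate() {
        if n > MAX_IMAGES_PER_SLIDE {
            return Err(format!(
                "slide {}: {} images exceeds the per-slide cap of {}",
                idx + 1,
                n,
                MAX_IMAGES_PER_SLIDE
            ));
        }
    }
    let total: usize = per_slide.iter().sum();
    if total > MAX_IMAGES_PER_DECK {
        return Err(format!(
            "deck: {} images exceeds the per-deck cap of {}",
            total, MAX_IMAGES_PER_DECK
        ));
    }
    Ok(())
}

/// Hypothetical helper: the per-image size check. In this PR an oversize
/// image is skipped with a warning rather than failing the deck.
fn check_image_size(len: u64) -> Result<(), String> {
    if len > MAX_IMAGE_BYTES {
        Err(format!("{} bytes exceeds the {} byte cap", len, MAX_IMAGE_BYTES))
    } else {
        Ok(())
    }
}
```

Note the asymmetry this PR describes: exceeding a count cap rejects the request, while a single oversize image only produces a skip-warning.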

Problem

Decks produced by generate_presentation (added in #3016, parent #2778) were text-only. Real presentations need charts, screenshots, and diagrams. ppt-rs 0.2.14 can embed images, so the agent should be able to attach them — sourced from artifacts it produced earlier or from local files — without a separate editing step.

Solution

  • Data model (types.rs): SlideImage { source, caption } with SlideImageSource::{Artifact{artifact_id}, File{path}} (internally tagged). New per-deck / per-slide / per-image caps. #[serde(deny_unknown_fields)] preserved.
  • Resolution at the async boundary (mod.rs::resolve_images): artifact → read_artifact_bytes; file → size-checked local read. Validated via a self-contained PNG/JPEG sniffer + pixel-dimension reader (image_util.rs). Failures become skip-warnings, never hard errors.
  • Placement (engine.rs): resolved (bytes, format, dims) are scaled aspect-preserving and stacked single-column in the slide's lower band, attached after the text. Caption renders as a trailing bullet.
  • Why a self-contained image util instead of reusing agent::multimodal's helpers: those are private + outside this change's boundary, they accept webp/gif/bmp (which ppt-rs 0.2.14 cannot embed — no webp Content-Type default, auto misclassifies), and they don't expose pixel dimensions (needed for placement). v1 therefore restricts embeddable formats to PNG + JPEG.
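
To make the self-contained sniffer concrete, here is a rough std-only sketch of magic-byte detection plus a PNG IHDR dimension read. The name `sniff_format` comes from image_util.rs, but the signatures, the `ImageFormat` enum, and the `png_dimensions` helper are assumptions for illustration. The byte layouts themselves (the 8-byte PNG signature, IHDR as the first chunk, and the FF D8 FF JPEG prefix) are standard.

```rust
// Assumed enum for the example; the real type in image_util.rs may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ImageFormat {
    Png,
    Jpeg,
}

// Standard 8-byte PNG file signature.
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// Classify bytes by magic number only; anything else (webp, gif, bmp)
/// is reported as unsupported by returning None.
fn sniff_format(bytes: &[u8]) -> Option<ImageFormat> {
    if bytes.starts_with(&PNG_SIGNATURE) {
        Some(ImageFormat::Png)
    } else if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        Some(ImageFormat::Jpeg)
    } else {
        None
    }
}

/// Hypothetical helper: read (width, height) from the IHDR chunk.
/// Layout: signature (8) + chunk length (4) + "IHDR" (4) + width (4,
/// big-endian) + height (4, big-endian) = 24 bytes.
fn png_dimensions(bytes: &[u8]) -> Option<(u32, u32)> {
    if bytes.len() < 24 || !bytes.starts_with(&PNG_SIGNATURE) || &bytes[12..16] != b"IHDR" {
        return None;
    }
    let width = u32::from_be_bytes([bytes[16], bytes[17], bytes[18], bytes[19]]);
    let height = u32::from_be_bytes([bytes[20], bytes[21], bytes[22], bytes[23]]);
    if width == 0 || height == 0 {
        return None;
    }
    Some((width, height))
}
```

Returning `None` for truncated or unknown input maps naturally onto the skip-with-warning behaviour: the caller turns it into a warning string instead of an error.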

Scope cuts (stated, not silent): URL source deferred (SSRF); webp/gif deferred (ppt-rs can't embed safely in 0.2.14); single-column layout only (multi-image grid deferred). No ppt-rs bump needed.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy — unit (sniff/dims, validation, placement, fit-within), integration (PNG round-trip → ppt/media/image1.png + Content-Types), and tool-level skip-with-warning + cap-reject paths.
  • Diff coverage ≥ 80% — Rust-only change; added unit + integration tests exercise every new/changed line (resolve, validate, placement, sniff, read_artifact_bytes). CI diff-cover is authoritative.
  • N/A: coverage matrix — additive image support to the existing generate_presentation feature; no new/removed/renamed feature row.
  • N/A: no matrix feature-ID change, so none to list under Related.
  • No new external network dependencies introduced — image bytes come from local artifacts/files only; no network fetch (URL source deferred).
  • N/A: not a release-cut smoke surface (core tool behaviour, no UI/installer change).
  • Linked issue closed via Closes #3209 in the ## Related section.

Impact

  • Platform: core Rust tool (generate_presentation); no src-tauri / frontend change. Runs in-process like before.
  • Security: image bytes resolved only from workspace artifacts or local file paths; no remote fetch. Per-image 5 MB cap enforced (file source stat-checked before read). Artifact reads go through the existing sandboxed artifacts path validation.
  • Compatibility: purely additive — images defaults to empty, so existing text-only callers and the tool's input/output schema for non-image use are unchanged.
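
The "stat-checked before read" point under Security can be illustrated with a small sketch. It uses blocking `std::fs` for brevity, whereas the review comments in this PR indicate the tool uses `tokio::fs`; the helper name `read_capped` and the error strings are hypothetical.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical helper: refuse to load a file larger than `max_bytes`,
/// checking the size via metadata before any bytes are read.
fn read_capped(path: &Path, max_bytes: u64) -> Result<Vec<u8>, String> {
    let meta = fs::metadata(path)
        .map_err(|e| format!("cannot stat {}: {}", path.display(), e))?;
    if !meta.is_file() {
        return Err(format!("{} is not a regular file", path.display()));
    }
    if meta.len() > max_bytes {
        return Err(format!(
            "{} is {} bytes, over the {} byte cap",
            path.display(),
            meta.len(),
            max_bytes
        ));
    }
    let bytes = fs::read(path)
        .map_err(|e| format!("cannot read {}: {}", path.display(), e))?;
    // The file could have grown between the stat and the read, so the cap
    // is re-checked on the bytes actually loaded.
    if bytes.len() as u64 > max_bytes {
        return Err(format!("{} grew past the cap while reading", path.display()));
    }
    Ok(bytes)
}
```

Checking the size first means an oversize file costs one metadata call rather than a multi-megabyte allocation.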

Related

  • Closes #3209


AI Authored PR Metadata (required for Codex/Linear PRs)

Keep this section for AI-authored PRs. For human-only PRs, mark each field N/A.

Linear Issue

  • Key: N/A (GitHub-issue-driven, not Linear)
  • URL: N/A

Commit & Branch

  • Branch: feat/3209-presentation-images
  • Commit SHA: 0754dc66908b62fe1371dd4cfa6cc433f0aaeff7

Validation Run

  • N/A: pnpm --filter openhuman-app format:check — no frontend changes.
  • N/A: pnpm typecheck — no frontend changes.
  • Focused tests: cargo test --lib openhuman::tools::implementations::presentation → 28 passed, 0 failed.
  • Rust fmt/check: cargo fmt --check clean, cargo clippy -p openhuman --all-targets -- -D warnings clean, cargo check -p openhuman clean.
  • N/A: Tauri fmt/check — no app/src-tauri changes.

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: generate_presentation can now embed PNG/JPEG images per slide from artifact or file sources.
  • User-visible effect: generated .pptx decks contain images beneath slide text; skipped images are reported back to the agent/user.

Parity Contract

  • Legacy behavior preserved: text-only decks render identically (images field defaults empty; title-slide + content-slide shape and slide_count semantics unchanged).
  • Guard/fallback/dispatch parity checks: bad-image path degrades to skip-with-warning (partial success), never failing a deck that would previously have succeeded.
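
The skip-with-warning contract amounts to partitioning per-image results into successes and warnings rather than propagating the first error. A generic sketch under that assumption follows; `resolve_or_warn` and the warning wording are hypothetical, not the body of resolve_images.

```rust
/// Hypothetical generic helper: resolve each source, keeping successes and
/// turning failures into human-readable warnings instead of aborting.
fn resolve_or_warn<S, R, F>(sources: &[S], mut resolve: F) -> (Vec<R>, Vec<String>)
where
    F: FnMut(&S) -> Result<R, String>,
{
    let mut resolved = Vec::new();
    let mut warnings = Vec::new();
    for (idx, src) in sources.iter().enumerate() {
        match resolve(src) {
            Ok(r) => resolved.push(r),
            // A failed image never fails the deck; it is only reported.
            Err(reason) => warnings.push(format!("image {} skipped: {}", idx + 1, reason)),
        }
    }
    (resolved, warnings)
}
```

With an empty `sources` slice both vectors come back empty, which matches the parity claim that text-only decks take the same path as before.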

Duplicate / Superseded PR Handling

  • Duplicate PR(s): none known
  • Canonical PR: this
  • Resolution: N/A

Summary by CodeRabbit

  • New Features
    • Embed PNG/JPEG images in presentations from local files or prior artifacts, with optional captions; images appear in a single-column stack beneath slide text and a synthetic title slide is added.
  • Bug Fixes / Behavior
    • Invalid images are skipped while generation proceeds; human-readable image_warnings report skipped-image reasons.
  • Limits
    • Enforced per-image, per-slide, and per-deck size/count limits; rejects overly large decks.
  • Documentation
    • Updated agent guidance documenting image embedding workflow and validation rules.

@oxoxDev oxoxDev requested a review from a team June 3, 2026 12:23
@coderabbitai
Contributor

coderabbitai Bot commented Jun 3, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 81976bcf-98fd-45af-aa4e-aff49c538c4b

📥 Commits

Reviewing files that changed from the base of the PR and between 0754dc6 and 6c5ade0.

📒 Files selected for processing (3)
  • src/openhuman/tools/impl/presentation/mod.rs
  • src/openhuman/tools/impl/presentation/tests.rs
  • src/openhuman/tools/ops.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/openhuman/tools/impl/presentation/tests.rs
  • src/openhuman/tools/impl/presentation/mod.rs

📝 Walkthrough

Walkthrough

Adds per-slide image support to generate_presentation: new types/limits, artifact/file byte resolution, PNG/JPEG sniffing and dimension extraction, engine layout and embedding with aspect-ratio scaling, SecurityPolicy wiring, re-exported artifact byte helper, updated tests, and orchestrator prompt docs.

Changes

Image Embedding for Presentations

| Layer / File(s) | Summary |
| --- | --- |
| Image type contracts and validation: src/openhuman/tools/impl/presentation/types.rs | Adds MAX_IMAGES_PER_SLIDE, MAX_IMAGES_PER_DECK, MAX_IMAGE_BYTES; SlideImageSource and SlideImage; extends SlideSpec.images; adds ResolvedSlideImage and GeneratePresentationOutput.image_warnings; updates validate_input for image counts and field validation. |
| Artifact byte reading infrastructure: src/openhuman/artifacts/store.rs, src/openhuman/artifacts/mod.rs | Adds read_artifact_bytes(workspace_dir, artifact_id) that enforces ArtifactStatus::Ready, resolves metadata path within artifacts root, reads bytes, logs length, and re-exports it. |
| Image format detection and pixel dimensions: src/openhuman/tools/impl/presentation/image_util.rs | Adds sniff_format (PNG/JPEG by magic bytes) and pixel_dimensions with PNG IHDR parsing and JPEG SOF scanning; rejects unsupported formats/truncated headers; includes unit tests. |
| Tool-level image resolution and wiring: src/openhuman/tools/impl/presentation/mod.rs | PresentationTool now stores a SecurityPolicy and constructor requires it; parameters_schema extended for per-slide images; implements resolve_images/resolve_one_image to read from artifacts/files (with security validation), enforce byte/format limits, sniff format, extract dimensions, produce ResolvedSlideImage, collect image_warnings, and pass resolved images to engine. |
| Presentation engine image layout and embedding: src/openhuman/tools/impl/presentation/engine.rs | generate signature updated to accept resolved images; build_slides attaches images after text; introduces PlacedImage, place_single_column, and fit_within to scale/center images in a single-column lower image band; updates blocking pptx generation and tests to pass images. |
| Tool-level image embedding integration tests: src/openhuman/tools/impl/presentation/tests.rs | Adds test_security and make_tool helpers, png_1x1, payload_of, pptx_entry_names, and async tests for file-image embedding (asserts ppt/media/image1.png), unsupported MIME warnings, oversize-image warnings, missing-artifact warnings, and deck-wide image-count rejection. |
| Orchestrator agent prompt documentation: src/openhuman/agent_registry/agents/orchestrator/prompt.md | Adds "Presentations with images" guidance: images array shape, supported artifact and file sources (no remote URLs), format/size limits, skip-and-warn behavior with image_warnings, grounding rule for captions, and v1 single-column stacked layout. |
| Tool registry wiring: src/openhuman/tools/ops.rs | Passes security.clone() into PresentationTool::new(...) when registering the tool. |
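
The "JPEG SOF scanning" mentioned for image_util.rs can be sketched as a walk over JPEG marker segments until a start-of-frame marker is found. This is an illustrative std-only version, not the code in this PR: the function name `jpeg_dimensions` and its signature are assumptions. The segment layout (0xFF marker, 2-byte big-endian length that includes itself, and an SOF payload of precision, height, width) follows the JPEG format.

```rust
/// Hypothetical helper: scan JPEG segments for a start-of-frame (SOF)
/// marker and return (width, height). Returns None on truncated or
/// malformed input.
fn jpeg_dimensions(bytes: &[u8]) -> Option<(u32, u32)> {
    // Every JPEG starts with the SOI marker FF D8.
    if !bytes.starts_with(&[0xFF, 0xD8]) {
        return None;
    }
    let mut i = 2;
    loop {
        // Each segment begins with 0xFF; additional 0xFF bytes are fill.
        if *bytes.get(i)? != 0xFF {
            return None;
        }
        while *bytes.get(i)? == 0xFF {
            i += 1;
        }
        let marker = bytes[i];
        i += 1;
        match marker {
            // TEM and RSTn are standalone markers with no length field.
            0x01 | 0xD0..=0xD7 => continue,
            // Stuffed byte, end of image, or start of scan before any SOF.
            0x00 | 0xD9 | 0xDA => return None,
            _ => {}
        }
        // Segment length is big-endian and counts its own two bytes.
        let len = u16::from_be_bytes([*bytes.get(i)?, *bytes.get(i + 1)?]) as usize;
        if len < 2 {
            return None;
        }
        // SOF0..SOF15, excluding DHT (C4), JPG (C8), and DAC (CC).
        let is_sof = (0xC0..=0xCF).contains(&marker) && !matches!(marker, 0xC4 | 0xC8 | 0xCC);
        if is_sof {
            // Payload: precision (1 byte), height (2 bytes), width (2 bytes).
            let h = u16::from_be_bytes([*bytes.get(i + 3)?, *bytes.get(i + 4)?]);
            let w = u16::from_be_bytes([*bytes.get(i + 5)?, *bytes.get(i + 6)?]);
            if w == 0 || h == 0 {
                return None;
            }
            return Some((u32::from(w), u32::from(h)));
        }
        i += len;
    }
}
```

Using `get(..)?` for every read means a truncated header yields `None` rather than a panic, which fits the "rejects truncated headers" behaviour described in the table.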

Sequence Diagram(s)

sequenceDiagram
  participant Agent
  participant Tool as PresentationTool
  participant ImageRes as resolve_images
  participant ArtifactStore as artifacts::store
  participant ImageUtil as image_util
  participant Engine as engine::generate
  participant PPTRs as ppt-rs

  Agent->>Tool: execute(input with slide images)
  Tool->>ImageRes: resolve_images per slide
  ImageRes->>ArtifactStore: read_artifact_bytes (artifact source)
  ImageRes->>Tool: validate file path via SecurityPolicy (file source)
  ImageRes->>ImageUtil: sniff_format(bytes)
  ImageUtil-->>ImageRes: PNG or JPEG
  ImageRes->>ImageUtil: pixel_dimensions(bytes, format)
  ImageUtil-->>ImageRes: (width, height)
  ImageRes-->>Tool: Vec<Vec<ResolvedSlideImage>> (skip failures with warnings)
  Tool->>Engine: generate(input, resolved_images, deadline)
  Engine->>Engine: build_slides (attach images)
  Engine->>Engine: place_single_column, fit_within (layout & scale)
  Engine->>PPTRs: Image::from_bytes(...).add_to(slide)
  PPTRs-->>Engine: slide with embedded image
  Engine-->>Tool: pptx bytes
  Tool-->>Agent: result (artifact, image_warnings)
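
The layout step in the diagram (place_single_column, fit_within) can be sketched as follows. The names `PlacedImage`, `place_single_column`, and `fit_within` come from engine.rs, but everything else here is assumed for illustration: the `Band` struct, the `f64` units, the equal-height cells, the gap parameter, and the choice to let small images scale up to fill their cell.

```rust
// Assumed shapes for the example; the real engine.rs types may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
struct PlacedImage {
    x: f64,
    y: f64,
    w: f64,
    h: f64,
}

/// Hypothetical description of the slide's lower image band.
#[derive(Debug, Clone, Copy)]
struct Band {
    x: f64,
    y: f64,
    w: f64,
    h: f64,
}

/// Scale (w, h) to fit inside (max_w, max_h) while preserving aspect ratio.
/// Assumption: images may be scaled up as well as down.
fn fit_within(w: f64, h: f64, max_w: f64, max_h: f64) -> (f64, f64) {
    if w <= 0.0 || h <= 0.0 {
        return (0.0, 0.0);
    }
    let scale = (max_w / w).min(max_h / h);
    (w * scale, h * scale)
}

/// Stack images top to bottom in equal-height cells separated by `gap`,
/// centering each image inside its cell.
fn place_single_column(dims: &[(f64, f64)], band: Band, gap: f64) -> Vec<PlacedImage> {
    if dims.is_empty() {
        return Vec::new();
    }
    let n = dims.len() as f64;
    let cell_h = (band.h - gap * (n - 1.0)) / n;
    dims.iter()
        .enumerate()
        .map(|(i, &(w, h))| {
            let (fw, fh) = fit_within(w, h, band.w, cell_h);
            PlacedImage {
                x: band.x + (band.w - fw) / 2.0,
                y: band.y + i as f64 * (cell_h + gap) + (cell_h - fh) / 2.0,
                w: fw,
                h: fh,
            }
        })
        .collect()
}
```

For example, a 400×200 image fitted into a 100×100 box becomes 100×50, since the smaller of the two scale factors (0.25) is applied to both sides.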

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • tinyhumansai/openhuman#2801: Introduced artifact persistence and metadata model (ArtifactStatus, get_artifact) that read_artifact_bytes extends.
  • tinyhumansai/openhuman#3016: Original generate_presentation implementation extended here to add per-slide image embedding and engine wiring.

Suggested labels

agent

Suggested reviewers

  • senamakel
  • graycyrus

Poem

🐰 I hopped through bytes and slides today,

PNGs tucked neatly beneath the say,
Captions checked, dimensions true,
Stacked in a column, not askew—
A rabbit's deck, now bright and gay.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat(presentation): embed images in generated decks' directly and clearly summarizes the main change: adding image embedding functionality to the presentation generation tool. |
| Linked Issues check | ✅ Passed | The PR fully implements all coding objectives from issue #3209: SlideSpec extended with images field, image resolution from artifacts/files with validation, embedding via ppt-rs, non-fatal skip-with-warning behavior, tests for media entries and validation, and orchestrator prompt updated. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to image embedding in presentation generation. No unrelated refactoring, dependency bumps, or out-of-scope features; remote URL fetch and WebP/GIF support were explicitly deferred per issue scope. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added feature Net-new user-facing capability or product behavior. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. labels Jun 3, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/tools/impl/presentation/mod.rs`:
- Around line 366-381: The File branch in SlideImageSource::File currently reads
arbitrary filesystem paths without validation; before calling
tokio::fs::metadata or tokio::fs::read, run the path through the existing Rust
validators (is_path_string_allowed and/or validate_path) and return an Err if
validation fails; use the same `path` variable and produce a clear error string
(e.g., "file {path} not allowed" or the validate_path message) so no filesystem
I/O occurs for disallowed paths and permitted paths proceed to metadata/read as
before.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 010027c5-ddac-483b-bfa4-43cad0dd9027

📥 Commits

Reviewing files that changed from the base of the PR and between a4f07b1 and 0754dc6.

📒 Files selected for processing (8)
  • src/openhuman/agent_registry/agents/orchestrator/prompt.md
  • src/openhuman/artifacts/mod.rs
  • src/openhuman/artifacts/store.rs
  • src/openhuman/tools/impl/presentation/engine.rs
  • src/openhuman/tools/impl/presentation/image_util.rs
  • src/openhuman/tools/impl/presentation/mod.rs
  • src/openhuman/tools/impl/presentation/tests.rs
  • src/openhuman/tools/impl/presentation/types.rs

Comment thread src/openhuman/tools/impl/presentation/mod.rs
…humansai#3209)

CodeRabbit: File-source image reads bypassed the path-permission model.
Route agent-supplied paths through SecurityPolicy::validate_path (inject
Arc<SecurityPolicy> at construction, matching the browser/image_info tools)
so a path like /etc/shadow or ~/.ssh/id_rsa is rejected before any I/O.
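
The idea behind this fix, validating the path before any filesystem I/O, can be sketched with a purely lexical check. This is not SecurityPolicy::validate_path, whose rules are not shown in this PR. `validate_image_path`, the workspace-root rule, and the error strings are hypothetical stand-ins, and a real policy would also need to consider symlinks.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical stand-in for a path policy: allow only paths that stay
/// under `workspace_root`. The check is lexical and performs no I/O, so a
/// disallowed path is rejected before metadata or read calls happen.
/// Symlink resolution is deliberately out of scope for this sketch.
fn validate_image_path(workspace_root: &Path, candidate: &str) -> Result<PathBuf, String> {
    let path = Path::new(candidate);
    // Reject any ".." component rather than trying to normalize it.
    if path.components().any(|c| matches!(c, Component::ParentDir)) {
        return Err(format!(
            "file {} not allowed: parent-directory components are rejected",
            candidate
        ));
    }
    // Relative paths are interpreted against the workspace root.
    let full = if path.is_absolute() {
        path.to_path_buf()
    } else {
        workspace_root.join(path)
    };
    // Component-wise prefix check, so "/workspace2" does not match "/workspace".
    if !full.starts_with(workspace_root) {
        return Err(format!(
            "file {} not allowed: outside {}",
            candidate,
            workspace_root.display()
        ));
    }
    Ok(full)
}
```

Under this sketch an absolute path such as /etc/shadow fails the prefix check and never reaches the metadata or read step.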
@coderabbitai coderabbitai Bot added the agent Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/. label Jun 3, 2026
@senamakel senamakel merged commit 679d463 into tinyhumansai:main Jun 4, 2026
23 of 26 checks passed

Labels

  • agent: Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/.
  • feature: Net-new user-facing capability or product behavior.
  • rust-core: Core Rust runtime in src/: CLI, core_server, shared infrastructure.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

generate_presentation: image embedding in slides via ppt-rs::add_image

2 participants