feat: detect attached editorial images, skip AI image generation by recoup-coding-agent · Pull Request #131 · recoupable/tasks

recoup-coding-agent · 2026-04-13T11:53:09Z

Summary

Adds AI-based editorial photo detection (mirrors existing face detection pattern) to classify attached images as professional press photos
When an editorial image is detected among attachments, the pipeline skips AI image generation and uses the attached image directly for video generation
Playlist cover overlays still proceed normally via ffmpeg

Changes

New: createEditorialDetectionAgent.ts — ToolLoopAgent for editorial photo classification
New: detectEditorialImage.ts — few-shot AI detection (is this a professional editorial press photo?)
Modified: classifyImages.ts — now returns editorialImageUrl alongside faceGuideUrl
Modified: resolveFaceGuide.ts — surfaces editorialImageUrl through the pipeline
Modified: createContentTask.ts — skips image generation + upscale when editorial image attached
New tests: detectEditorialImage.test.ts, classifyImages.test.ts
Updated tests: resolveFaceGuide.test.ts (new param + return field)

Test plan

All 347 tests pass
No new TypeScript errors introduced
Manual test: create content with editorial image attached → verify image gen is skipped
Manual test: create content without editorial image → verify normal pipeline unchanged
Manual test: create content with face guide + editorial + playlist cover → verify correct classification

🤖 Generated with Claude Code

Summary by cubic

Detects editorial press photos in attachments and, when found, uses them directly for video creation, skipping image generation and upscale. Replaces per-kind detectors with a single multi-class classifier and pulls base-image logic into resolveBaseImage.

New Features
- Replaced per-kind detectors with classifyImage via createImageClassificationAgent (uses IMAGE_DETECTION_MODEL) returning face_guide | editorial | additional.
- classifyImages now returns faceGuideUrl, editorialImageUrl, and additionalImageUrls; runs classification only when usesFaceGuide or usesEditorialImage is true.
- resolveFaceGuide surfaces editorialImageUrl; createContentTask uses it to skip image generation and upscaling.
- Extracted resolveBaseImage to handle editorial bypass vs. generate + optional upscale; added focused tests.
Bug Fixes
- Use original input URLs for classification to avoid provider fetch errors.
- Fixed the editorial reference URL in few-shot examples.

Written for commit e7a0a5d. Summary will update on new commits.
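
The multi-class flow summarized above can be sketched roughly as follows. This is a minimal illustration, not the merged code: `classifyImagesSketch`, the parameter object shape, and the injected `classifyImage` callback are assumptions made so the example runs on its own, and the real pipeline also re-uploads each image, which is omitted here. Sending later matches (or matches for a disabled flag) to `additionalImageUrls` is likewise an assumption.

```typescript
// The three kinds are named in the PR; the surrounding types are hypothetical.
type ImageKind = "face_guide" | "editorial" | "additional";

interface ClassifyImagesResult {
  faceGuideUrl: string | null;
  editorialImageUrl: string | null;
  additionalImageUrls: string[];
}

interface ClassifyImagesParams {
  images: string[];
  usesFaceGuide: boolean;
  usesEditorialImage: boolean;
  // Injected stand-in for the real model call (one request per image).
  classifyImage: (imageUrl: string) => Promise<ImageKind>;
}

async function classifyImagesSketch(
  params: ClassifyImagesParams,
): Promise<ClassifyImagesResult> {
  const { images, usesFaceGuide, usesEditorialImage, classifyImage } = params;
  const result: ClassifyImagesResult = {
    faceGuideUrl: null,
    editorialImageUrl: null,
    additionalImageUrls: [],
  };

  // Neither flag set: skip the model entirely and treat every image as additional.
  if (!usesFaceGuide && !usesEditorialImage) {
    result.additionalImageUrls = [...images];
    return result;
  }

  for (const imageUrl of images) {
    const kind = await classifyImage(imageUrl);
    if (kind === "face_guide" && usesFaceGuide && result.faceGuideUrl === null) {
      // First face-guide match wins.
      result.faceGuideUrl = imageUrl;
    } else if (
      kind === "editorial" &&
      usesEditorialImage &&
      result.editorialImageUrl === null
    ) {
      // First editorial match wins and is excluded from additionalImageUrls.
      result.editorialImageUrl = imageUrl;
    } else {
      result.additionalImageUrls.push(imageUrl);
    }
  }
  return result;
}
```

Injecting the classifier keeps the dispatch logic testable without a model provider, which is one plausible way to write the kind of focused tests the summary mentions.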

Summary by CodeRabbit

New Features
- Added editorial image detection to identify press photos for use as image overlays, streamlining workflows by utilizing existing editorial images instead of generating new ones when applicable.
Tests
- Expanded test coverage with comprehensive test suites for editorial image detection, image classification scenarios, and face-guide resolution workflows to ensure feature reliability.

When an attached image is classified as an editorial press photo (professional portrait with cinematic lighting, no text/overlays), the pipeline now uses it directly for video generation instead of generating a new AI image.

Changes:
- Add createEditorialDetectionAgent for AI-based editorial photo classification
- Add detectEditorialImage using few-shot prompting (mirrors detectFace pattern)
- Update classifyImages to return editorialImageUrl alongside faceGuideUrl
- Update resolveFaceGuide to surface editorialImageUrl through the pipeline
- Update createContentTask to skip image generation when editorial image found
- Add tests for detectEditorialImage, classifyImages, update resolveFaceGuide tests

Co-Authored-By: Paperclip <noreply@paperclip.ing>

coderabbitai · 2026-04-13T11:53:22Z

📝 Walkthrough

Added editorial-photo detection: new ToolLoopAgent factory and detection function using few‑shot prompts; integrated editorial checks into image classification and face‑guide resolution; updated content task to skip AI image generation when an editorial image is attached; tests added/updated for detection and classification flows.

Changes

Cohort / File(s)	Summary
Editorial Agent `src/agents/createEditorialDetectionAgent.ts`	New factory returning a ToolLoopAgent configured with Google Gemini, Zod schema (`isEditorial: boolean`), and single-step stop behavior.
Editorial Detection Function `src/content/detectEditorialImage.ts`	New exported `detectEditorialImage(imageUrl)` that creates the agent, sends a few‑shot prompt (example editorial + target), logs results, and returns boolean with error handling.
Image Classification `src/content/classifyImages.ts`	Added `usesImageOverlay` param; performs editorial detection when enabled; first editorial match set to `editorialImageUrl` (excluded from `additionalImageUrls`); face detection uses original image URL; return type now includes `editorialImageUrl`.
Face Guide Resolution `src/content/resolveFaceGuide.ts`	Added `editorialImageUrl` to `ResolveFaceGuideResult` and `usesImageOverlay` param; flows editorialImageUrl through classification and fallback branches; updated JSDoc.
Content Task `src/tasks/createContentTask.ts`	Step 2 now passes `usesImageOverlay` into resolveFaceGuide. Step 5: if `editorialImageUrl` present, use it directly and skip image-generation/upscale; otherwise preserve prior generation flow and upscaling logic.
Tests — New `src/content/__tests__/detectEditorialImage.test.ts`, `src/content/__tests__/classifyImages.test.ts`	New Vitest suites mocking agent and dependencies; assert few‑shot prompt structure, agent outputs, editorial selection ordering, additionalImageUrls behavior, and error cases.
Tests — Updated `src/content/__tests__/resolveFaceGuide.test.ts`	Updated to pass `usesImageOverlay: false`, expect `editorialImageUrl: null`, changed mock reset to `vi.resetAllMocks()`, and adjusted mock call counts.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ContentTask as Content Task
    participant Classify as classifyImages()
    participant Detect as detectEditorialImage()
    participant Agent as Editorial Agent
    participant Gemini as Google Gemini
    participant Gen as generateImage()

    Client->>ContentTask: create content request (images, usesImageOverlay)
    ContentTask->>Classify: classifyImages(images, usesFaceGuide, usesImageOverlay)

    loop per image
        Classify->>Detect: detectEditorialImage(imageUrl)
        Detect->>Agent: createEditorialDetectionAgent()
        Detect->>Gemini: generate(few-shot prompt + target image)
        Gemini-->>Detect: { isEditorial: true|false }
        Detect-->>Classify: boolean result

        alt isEditorial == true and none selected yet
            Classify->>Classify: set editorialImageUrl (exclude from additional)
        else
            Classify->>Classify: add to additionalImageUrls / continue
        end
    end

    Classify-->>ContentTask: { faceGuideUrl, editorialImageUrl, additionalImageUrls }

    alt editorialImageUrl exists
        ContentTask->>ContentTask: use editorialImageUrl (skip generate/upscale)
    else
        ContentTask->>Gen: generateImage(imageRefs, prompt)
        Gen-->>ContentTask: generated image (may be upscaled)
    end

    ContentTask-->>Client: created content response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat: add artist-release-editorial template files #125: Also modifies image-overlay/face-guide pipeline and usesImageOverlay routing—closely related to editorial-selection flow.
fix: face-swap instruction and overlay image routing for editorial template #126: Adjusts createContentTask image routing for overlays and additionalImageUrls, overlapping content-task changes.
feat: support attached audio/image in content pipeline #116: Past changes to content/media handling and resolveFaceGuide that this PR builds upon.

Poem

🐰 I sniffed the pixels, one by one,
A few‑shot hint, the job was done.
If real press light is found today,
We skip the forge and save the day.
Hooray for photos that arrive—hooray! 📸

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately and concisely describes the main feature: detecting editorial images and skipping AI generation when found, which aligns with the PR's core objective.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

recoup-coding-agent · 2026-04-13T11:57:26Z

Code Review — Editorial Image Detection

Summary

This PR adds AI-based editorial photo detection to the content pipeline. When an attached image is identified as a professional editorial press photo (via Gemini Flash), it's used directly as the base image — skipping AI image generation and upscaling. Clean, focused implementation that follows existing patterns.

CI Status

Check	Status	Conclusion
`test`	completed	✅ success
`cubic`	in_progress	⏳ pending
`CodeRabbit`	pending	⏳ pending

test (the only build/CI check) passes. Remaining pending items are third-party review bots.

Branch Status

Mergeable: ✅ Yes
Merge state: unstable (due to pending review bot checks, not build failures)
Branch freshness: No merge conflicts detected

CLEAN Code Assessment

SRP ✅ — Each new file has one clear responsibility: createEditorialDetectionAgent.ts configures the AI agent, detectEditorialImage.ts handles detection logic, classifyImages.ts orchestrates classification.

OCP ✅ — Extended classifyImages and resolveFaceGuide with new parameters rather than restructuring existing logic. The editorial path is additive.

DRY ✅ — Follows the same detection pattern as detectFace (create agent → generate → parse output). Reuses existing fetchImageFromUrl and classifyImages infrastructure.

YAGNI ✅ — No over-engineering. Focused implementation with no speculative features.

Issues Found

Suggestions (non-blocking):

Preview model (google/gemini-3.1-flash-lite-preview) — The model ID includes -preview, which may be deprecated or changed without notice. Consider tracking this and switching to a GA model when available.
Single reference example — EDITORIAL_EXAMPLE_URLS has one entry. The few-shot approach works, but adding 1-2 more diverse examples (different lighting, backgrounds) could improve classification accuracy. Not blocking since the current approach is functional and error-safe.

Security

✅ No hardcoded secrets or API keys
✅ Reference image URL is a public Vercel blob — appropriate for example data
✅ Error handling returns false on failure (safe default, no information leak)
✅ Errors are logged with truncated URLs (slice(0, 80)) — no sensitive data exposure

Verdict: approve ✅

Well-structured, follows existing patterns, good test coverage (new test files + updated existing tests), and graceful error handling. The suggestions above are minor improvements for future iterations.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/content/classifyImages.ts (1)

34-53: ⚠️ Potential issue | 🟠 Major

Don't short-circuit before editorial classification.

The continue on Line 41 exits before the editorial check on Line 45, so a single attached press photo with a visible face only sets faceGuideUrl. Downstream, src/tasks/createContentTask.ts keeps generating an AI image because editorialImageUrl never gets populated.

💡 Suggested fix

   for (const imageUrl of images) {
     const uploadedUrl = await fetchImageFromUrl(imageUrl);
+    let hasFace = false;
 
     if (usesFaceGuide && !faceGuideUrl) {
-      const hasFace = await detectFace(uploadedUrl);
+      hasFace = await detectFace(uploadedUrl);
       if (hasFace) {
         faceGuideUrl = uploadedUrl;
-        continue;
       }
     }
 
     if (usesImageOverlay && !editorialImageUrl) {
       const isEditorial = await detectEditorialImage(uploadedUrl);
       if (isEditorial) {
         editorialImageUrl = uploadedUrl;
         continue;
       }
     }
+
+    if (hasFace) {
+      continue;
+    }
 
     additionalImageUrls.push(uploadedUrl);
   }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/content/classifyImages.ts` around lines 34 - 53, In classifyImages.ts
inside the loop that processes images (the block calling fetchImageFromUrl),
don't short-circuit with continue after detectFace; instead run both detectFace
and detectEditorialImage for the same uploadedUrl so one image can be both a
face guide and an editorial image. Concretely, in the loop that references
usesFaceGuide, faceGuideUrl, usesImageOverlay, editorialImageUrl, detectFace and
detectEditorialImage: compute hasFace and isEditorial for uploadedUrl, set
faceGuideUrl if usesFaceGuide && !faceGuideUrl && hasFace, set editorialImageUrl
if usesImageOverlay && !editorialImageUrl && isEditorial, and only push
uploadedUrl to additionalImageUrls if neither assignment happened. This
preserves the original intent while allowing a single image to populate both
faceGuideUrl and editorialImageUrl.
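
The behavior this prompt asks for, where neither detector short-circuits the other, might look like the sketch below. It is illustrative only: `classifyWithoutShortCircuit` and the `DetectorDeps` shape are hypothetical, the detectors are injected so the example is self-contained, and the `fetchImageFromUrl` upload step from the diff is left out.

```typescript
// Hypothetical dependency shape; the real detectors call a model provider.
interface DetectorDeps {
  detectFace: (imageUrl: string) => Promise<boolean>;
  detectEditorialImage: (imageUrl: string) => Promise<boolean>;
}

interface BinaryClassification {
  faceGuideUrl: string | null;
  editorialImageUrl: string | null;
  additionalImageUrls: string[];
}

async function classifyWithoutShortCircuit(
  images: string[],
  usesFaceGuide: boolean,
  usesImageOverlay: boolean,
  deps: DetectorDeps,
): Promise<BinaryClassification> {
  let faceGuideUrl: string | null = null;
  let editorialImageUrl: string | null = null;
  const additionalImageUrls: string[] = [];

  for (const imageUrl of images) {
    let assigned = false;

    // Face check: no `continue`, so the editorial check below still runs.
    if (usesFaceGuide && faceGuideUrl === null && (await deps.detectFace(imageUrl))) {
      faceGuideUrl = imageUrl;
      assigned = true;
    }

    // Editorial check runs for the same image, even if it was just picked as face guide.
    if (
      usesImageOverlay &&
      editorialImageUrl === null &&
      (await deps.detectEditorialImage(imageUrl))
    ) {
      editorialImageUrl = imageUrl;
      assigned = true;
    }

    // Only images that filled neither slot count as additional.
    if (!assigned) {
      additionalImageUrls.push(imageUrl);
    }
  }

  return { faceGuideUrl, editorialImageUrl, additionalImageUrls };
}
```

With this shape, a single press photo that shows a face fills both `faceGuideUrl` and `editorialImageUrl`, so the downstream task can still take the editorial bypass.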

🧹 Nitpick comments (1)

src/content/detectEditorialImage.ts (1)
1-1: Use the Trigger.dev logger for this new classification path.

This helper adds new runtime logging through logStep, so the editorial-detection flow bypasses the standard logger the repo expects.

As per coding guidelines, "Use logger from @trigger.dev/sdk/v3 for logging".

Also applies to: 45-50
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/content/detectEditorialImage.ts` at line 1, Replace the ad-hoc logStep
usage with the repo-standard Trigger.dev logger: remove the import of logStep in
src/content/detectEditorialImage.ts and import { logger } from
"@trigger.dev/sdk/v3"; then update all calls to logStep (including the new
runtime logging at the top and the calls around the detect/editorial flow at
lines ~45-50) to use logger.info/debug/error as appropriate, preserving the
original messages and context; ensure function names like detectEditorialImage
(or any exported helpers in this file) now call logger instead of logStep and
that the import and usages compile.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0ed5ea98-2e18-43ce-bc89-8501332afcc6

📥 Commits

Reviewing files that changed from the base of the PR and between 3071b1b and b0795ab.

📒 Files selected for processing (8)

src/agents/createEditorialDetectionAgent.ts
src/content/__tests__/classifyImages.test.ts
src/content/__tests__/detectEditorialImage.test.ts
src/content/__tests__/resolveFaceGuide.test.ts
src/content/classifyImages.ts
src/content/detectEditorialImage.ts
src/content/resolveFaceGuide.ts
src/tasks/createContentTask.ts

cubic-dev-ai

No issues found across 8 files

…tion Re-uploaded fal.media URLs are brand-new and occasionally unreachable by the model provider, causing detection to fail with "Cannot fetch content from the provided URL". The original input URL is already reachable (we just downloaded from it), so use it for detection and reserve the fal upload for downstream fal.ai pipelines. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
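
A minimal sketch of the URL routing this commit describes, assuming the re-upload and the classifier are both available as async functions. `classifyThenRoute`, `ImageIo`, and the example URLs in the usage note are hypothetical names introduced for illustration.

```typescript
// Hypothetical I/O surface: `upload` stands in for the re-upload step and
// `classify` for the model call that must be able to fetch the URL it is given.
interface ImageIo {
  upload: (originalUrl: string) => Promise<string>;
  classify: (imageUrl: string) => Promise<string>;
}

interface RoutedImage {
  kind: string;
  downstreamUrl: string;
}

async function classifyThenRoute(originalUrl: string, io: ImageIo): Promise<RoutedImage> {
  // The upload still happens, because downstream steps expect the uploaded copy.
  const uploadedUrl = await io.upload(originalUrl);

  // Classification deliberately uses the original URL, which was just fetched
  // successfully, rather than the brand-new uploaded URL.
  const kind = await io.classify(originalUrl);

  return { kind, downstreamUrl: uploadedUrl };
}
```

The point of the split is that the provider only ever sees a URL known to be reachable, while the uploaded copy is reserved for the later pipeline stages.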

sweetmantech · 2026-04-13T11:57:49Z

+ */
+export function createEditorialDetectionAgent() {
+  return new ToolLoopAgent({
+    model: "google/gemini-3.1-flash-lite-preview",


DRY - Is this the same model used in the face guide detection agent?

actual: not using shared const with other image detection agents.

required: inline string replaced with shared const model string used both here and in the face guide detection agent.
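
One way to satisfy this request is sketched below. The constant name `IMAGE_DETECTION_MODEL` and the model ID come from this PR; the `AgentConfig` shape and the two config helpers are hypothetical stand-ins for the real agent factories, whose options are not shown in the thread.

```typescript
// Single source of truth for the model used by all image detection agents.
const IMAGE_DETECTION_MODEL = "google/gemini-3.1-flash-lite-preview";

// Hypothetical, simplified config shape; the real factories build ToolLoopAgent instances.
interface AgentConfig {
  model: string;
  instructions: string;
}

function faceDetectionConfig(): AgentConfig {
  return {
    model: IMAGE_DETECTION_MODEL,
    instructions: "Decide whether the image can serve as a face guide.",
  };
}

function editorialDetectionConfig(): AgentConfig {
  return {
    model: IMAGE_DETECTION_MODEL,
    instructions: "Decide whether the image is a professional editorial press photo.",
  };
}
```

Swapping the preview model for a GA one then becomes a one-line change that applies to every agent at once.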

sweetmantech · 2026-04-13T11:59:27Z

 export async function classifyImages({
  images,
  usesFaceGuide,
+  usesImageOverlay,


Why is this variable named usesImageOverlay rather than usesEditorialImage?

sweetmantech · 2026-04-13T12:00:03Z

      }
    }

+    if (usesImageOverlay && !editorialImageUrl) {


KISS - why is the editorialImageUrl check required here? It is initialized in this function as null without modifications, right?

sweetmantech · 2026-04-13T12:01:20Z

+import { createEditorialDetectionAgent } from "../agents/createEditorialDetectionAgent";
+
+const EDITORIAL_EXAMPLE_URLS = [
+  "https://dxfamqbi5zyezrs5.public.blob.vercel-storage.com/content-templates/artist-release-editorial/references/images/ref-01.png",


This link is invalid.

Correct link: https://dxfamqbi5zyezrs5.public.blob.vercel-storage.com/content-attachments/image/1776085791241-Screenshot%202026-04-08%20at%203.27.37%E2%80%AFPM.png

- Extract IMAGE_DETECTION_MODEL constant shared by face + editorial agents (DRY)
- Fix invalid editorial reference URL (was 404); use the valid blob URL
- Rename classify-layer param usesImageOverlay → usesEditorialImage; leave the template-level field as usesImageOverlay (it still means "overlays")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…orial detectors Consolidates the two-shot image classification scaffold (example image + target image + structured boolean output + error fallback) into a single helper. detectFace and detectEditorialImage become thin configuration wrappers. Also removes URL truncation in logs — full URLs make the logs verifiable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

sweetmantech · 2026-04-13T13:47:15Z

+    if (usesEditorialImage && !editorialImageUrl) {
+      const isEditorial = await detectEditorialImage(imageUrl);
+      if (isEditorial) {
+        editorialImageUrl = uploadedUrl;
+        continue;
+      }
+    }


KISS principle: have we considered whether running 2 checks on the same image is necessary? Alternatively, how could the existing src/content/classifyImages.ts + agent file definitions change to handle both cases in a single request?

Replace the per-kind binary detectors (detectFace + detectEditorialImage) with a single classifyImage that returns one of {face_guide, editorial, additional} in one Gemini call. Cuts API calls per image roughly in half and makes adding new image categories a 2-line change (enum variant + few-shot example) instead of a new agent + detection function + pipeline branch.

- Remove: detectFace, detectEditorialImage, runImageFewShotClassification, createFaceDetectionAgent, createEditorialDetectionAgent (and their tests)
- Add: createImageClassificationAgent (z.enum schema), classifyImage (single few-shot call with one example per positive kind)
- classifyImages dispatches on the returned kind; skips classification entirely when neither flag is set

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
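
To illustrate why a single enum makes new categories cheap, here is a small sketch of the kind set and a guard around it. It uses a hand-rolled type guard in place of the z.enum schema named in the commit, and falling back to "additional" for unrecognized output is an assumption, not something the commit states.

```typescript
// The three kinds come from the commit message. Under this sketch, adding a
// category means appending one entry here (plus a few-shot example elsewhere).
const IMAGE_KINDS = ["face_guide", "editorial", "additional"] as const;
type ImageKind = (typeof IMAGE_KINDS)[number];

// Hand-rolled guard standing in for schema validation of the model output.
function isImageKind(value: unknown): value is ImageKind {
  return (
    typeof value === "string" &&
    (IMAGE_KINDS as readonly string[]).indexOf(value) !== -1
  );
}

// Assumption: anything the guard rejects is treated as a plain additional image,
// so a malformed model response never blocks the pipeline.
function parseImageKind(raw: unknown): ImageKind {
  return isImageKind(raw) ? raw : "additional";
}
```

For example, `parseImageKind("editorial")` yields `"editorial"`, while an unexpected value such as `"poster"` is mapped to `"additional"`.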

sweetmantech · 2026-04-13T13:55:43Z

+    // --- Step 5: Generate image (API) — skip if editorial image attached ---
+    let imageUrl: string;
+
+    if (editorialImageUrl) {
+      logStep("Using attached editorial image, skipping AI image generation", true, {
+        editorialImageUrl: editorialImageUrl.slice(0, 80),
+      });
+      imageUrl = editorialImageUrl;
+    } else {
+      logStep("Generating image via API");
+      const referenceImagePath = pickRandomReferenceImage(template);
+      const instruction = resolveImageInstruction(template);
+      const basePrompt = `${instruction} ${template.imagePrompt}`;
+      const fullPrompt = buildImagePrompt(basePrompt, template.styleGuide);
+
+      const imageRefs: string[] = [];
+      if (faceGuideUrl) imageRefs.push(faceGuideUrl);
+      if (referenceImagePath) imageRefs.push(referenceImagePath);
+      if (!template.usesImageOverlay && additionalImageUrls.length) {
+        imageRefs.push(...additionalImageUrls);
+      }
+
+      imageUrl = await generateImage({
+        prompt: fullPrompt,
+        referenceImageUrl: faceGuideUrl ?? undefined,
+        images: imageRefs.length > 0 ? imageRefs : undefined,
+      });


OCP - how can we minimize the additions to the src/tasks/createContentTask.ts function? If new logic is needed, abstract it into a new function file following TDD.

Step 5-6 of the pipeline (use editorial image OR generate + optional upscale) is now a single function call in the orchestrator. New image routing logic can live in resolveBaseImage without bloating the task file. Red-green TDD with a dedicated test file covering the editorial bypass, generation path, upscale toggle, and overlay-aware imageRefs assembly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
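
A rough sketch of what such an extracted function could look like, under stated assumptions. The name `resolveBaseImage` and the imageRefs rules come from this PR; the `Sketch` suffix, the parameter object, the `upscale` flag name, and the injected `generateImage` and `upscaleImage` callbacks are hypothetical, chosen so the example runs without the real services.

```typescript
interface ImageRefInputs {
  faceGuideUrl: string | null;
  referenceImagePath: string | null;
  usesImageOverlay: boolean;
  additionalImageUrls: string[];
}

// Mirrors the imageRefs assembly in the reviewed diff: additional images are
// only passed as references when the template does not use them as overlays.
function buildImageRefs(input: ImageRefInputs): string[] {
  const refs: string[] = [];
  if (input.faceGuideUrl) refs.push(input.faceGuideUrl);
  if (input.referenceImagePath) refs.push(input.referenceImagePath);
  if (!input.usesImageOverlay && input.additionalImageUrls.length) {
    refs.push(...input.additionalImageUrls);
  }
  return refs;
}

interface ResolveBaseImageParams extends ImageRefInputs {
  editorialImageUrl: string | null;
  upscale: boolean;
  // Injected stand-ins for the real generation and upscale services.
  generateImage: (imageRefs: string[]) => Promise<string>;
  upscaleImage: (imageUrl: string) => Promise<string>;
}

async function resolveBaseImageSketch(params: ResolveBaseImageParams): Promise<string> {
  // Editorial bypass: use the attached photo as-is, with no generation or upscale.
  if (params.editorialImageUrl) {
    return params.editorialImageUrl;
  }
  const generated = await params.generateImage(buildImageRefs(params));
  return params.upscale ? params.upscaleImage(generated) : generated;
}
```

The orchestrator then needs only one call for these steps, and each branch (bypass, generate, generate plus upscale) can be covered by a small unit test with stubbed callbacks.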

recoup-coding-agent requested a review from sweetmantech as a code owner April 13, 2026 11:53

coderabbitai bot reviewed Apr 13, 2026

cubic-dev-ai bot reviewed Apr 13, 2026

sweetmantech reviewed Apr 13, 2026

sweetmantech and others added 2 commits April 13, 2026 08:28

sweetmantech reviewed Apr 13, 2026

sweetmantech approved these changes Apr 13, 2026

sweetmantech merged commit a0523c5 into main Apr 13, 2026
2 checks passed

sweetmantech deleted the feature/editorial-image-detection branch April 13, 2026 14:03
